Abstract

We consider the genealogy of a sample of individuals taken from a spatially structured population
when the variance of the offspring distribution is relatively large. The space is structured
into discrete sites of a graph G. If the population size at each site is large, spatial coalescents
with multiple mergers, so-called spatial Λ-coalescents, for which ancestral lines migrate in space
and coalesce according to some Λ-coalescent mechanism, are shown to be appropriate approximations
to the genealogy of a sample of individuals.

We then consider as the graph G the two dimensional torus with side length 2L + 1 and show
that as L tends to infinity, and time is rescaled appropriately, the partition structure of spatial
Λ-coalescents of individuals sampled far enough apart converges to the partition structure of
a non-spatial Kingman coalescent. From a biological point of view this means that in certain
circumstances both the spatial structure as well as larger variances of the underlying offspring
distribution are harder to detect from the sample. However, supplemental simulations show that
for moderately large L the different structure is still evident.

Keywords: spatial Cannings model, coalescent, Λ-coalescent, spatial coalescent, two dimensional torus, limit theorems
1. Introduction

The goal of this article is to study the genealogies of a sample of individuals from a spatially 
structured population when the variance in the number of each individual's offspring is relatively 
large. Larger variances in the offspring distribution are thought to arise, for example, due to 
particular reproduction mechanisms of various species leading to the existence of few individuals 
with many offspring (Eldon and Wakeley, 2006), and also due to recurring selective sweeps 
(Durrett and Schweinsberg, 2004, 2005).

The space is structured into discrete sites of a graph G with a colony of a fixed number of 
individuals at each site in G as well as migration between sites. As the underlying population 
models we introduce spatial Cannings models, which are extensions of the stepping stone model 
with general Cannings type offspring distributions. 
The genealogies are modeled by coalescent processes that code for the ancestral lines of the
individuals in a sample from the present day population backwards in time. Coalescence, which
refers to a merger of ancestral lines, occurs when a common ancestor of various individuals is reached.
We consider coalescent models that are appropriate if the population at each site of G is large.
Our special focus in the analysis then lies on models where the number of sites |G| in the graph
is finite but also large; more precisely, we choose G to be a large two dimensional torus.

The special case when one considers only one site (the non-spatial situation with |G| = 1)
leads to the classical Kingman coalescent with only binary mergers provided that the variance of 
the offspring distribution stays bounded in some sense as the population size N tends to infinity. 
This coalescent process has been well studied since its introduction by Kingman (1982a; 1982b), 
see for example Wakeley (2009) for an overview. 

Genealogies for populations with a larger variance in the number of offspring have been studied
by mathematicians and biologists in the more recent past. In this case, the genealogy is
described by the coalescent with multiple mergers, the so-called Λ-coalescent, which was
independently introduced by Pitman (1999) and Sagitov (1999). (More generally, one can even
consider coalescents with simultaneous multiple mergers, the so-called Ξ-coalescents, but we
will focus here on the subclass of Λ-coalescents.) Here, Λ is a finite measure on [0, 1]. When
there are currently b distinct ancestral lines then any collection of k ancestral lines coalesces and
thus merges into one new ancestral line at rate

$$\lambda_{b,k} := \int_{[0,1]} z^{k-2} (1-z)^{b-k} \, \Lambda(dz) \qquad (1)$$

with $2 \le k \le b$, $k, b \in \mathbb{N}$. We formally extend this definition by setting $\lambda_{b,k} = 0$ for $b = 1$ or
$b = 0$, $k \in \mathbb{N}$. The Kingman coalescent, which is appropriate for populations in which the offspring
variance is not so large, corresponds to the case $\Lambda = \delta_0$, the delta measure at 0. In this case $\lambda_{b,k}$
is only nontrivial if $k = 2$, and we obtain $\lambda_{b,2} = 1$ so that the total coalescence rate is $\binom{b}{2}$. This
means that only binary mergers of pairs of ancestral lines are possible and happen at rate 1 per
pair of ancestral lines.
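
The merger rates in (1) can be evaluated in closed form for the two measures used later in the simulations. The following is a minimal sketch, not taken from the original analysis: the function names are illustrative, and the formula for Λ uniform on [0, 1] (the Bolthausen-Sznitman case) is the standard Beta integral $\int_0^1 z^{k-2}(1-z)^{b-k}\,dz = (k-2)!\,(b-k)!/(b-1)!$.

```python
from fractions import Fraction
from math import comb, factorial


def kingman_rate(b, k):
    """lambda_{b,k} for Lambda = delta_0: only pairs merge, at rate 1 per pair."""
    return Fraction(1) if (k == 2 and b >= 2) else Fraction(0)


def bolthausen_sznitman_rate(b, k):
    """lambda_{b,k} for Lambda uniform on [0, 1], via the Beta integral."""
    if not 2 <= k <= b:
        return Fraction(0)
    return Fraction(factorial(k - 2) * factorial(b - k), factorial(b - 1))


def total_rate(rate, b):
    """Total coalescence rate with b lines: sum over k of C(b, k) * lambda_{b,k}."""
    return sum(comb(b, k) * rate(b, k) for k in range(2, b + 1))
```

For the Kingman case the total rate with b lines is $\binom{b}{2}$, in agreement with the text above; for the uniform measure the same sum telescopes to b - 1.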

In order to extend this to a spatial or structured coalescent on a graph G we imagine that
ancestral lines migrate independently from site x to site y in G with rates p(x, y) but coalesce
independently at each site according to the rates given in (1). The formal definition of this
spatial Λ-coalescent can be found in Section 2 and in particular in Definition 2.1. In the case
of Kingman coalescence at each site the resulting process is often simply called the structured
coalescent, which was derived rigorously from a forward population model by Herbots (1997)
but already studied earlier by Notohara (1990) and since then by many others, see for example
Donnelly and Kurtz (1999) or Greven et al. (2005). The case when coalescence at each site takes
place according to a Λ-coalescent was included in a setting considered by Donnelly and Kurtz
(1999), and was studied in more detail by Limic and Sturm (2006).

After introducing the spatial Cannings model in Section 3 we show that spatial Λ-coalescents
arise when one considers the genealogy of individuals sampled in those spatial Cannings models,
provided that the number of individuals at each site is large, see Proposition 3.1 for a precise
formulation of the result. This result is a rather straightforward generalisation of the work of
Herbots (1997) on the derivation of the structured coalescent combined with the work of Möhle
and Sagitov (2001) on the derivation of Λ-coalescents from non-spatial Cannings models; we
include it here for completeness.

In Section 4 we consider the spatial Λ-coalescent on the two dimensional torus
$T^L = [-L, L]^2 \cap \mathbb{Z}^2$ with side length $2L + 1$. In our main result, Theorem 4.1, we show that as L tends to
infinity, and time is rescaled appropriately, the partition structure of spatial Λ-coalescents of
individuals sampled far enough apart converges to the partition structure of a non-spatial Kingman
coalescent. This extends in particular the work of Limic and Sturm (2006), which dealt with the
case $d \ge 3$. The two dimensional case considered here is biologically the most relevant. It also
needs to be treated differently mathematically.

Since the information in a genetic sample only depends on the partition structure of the
genealogy of the sampled individuals as well as on some (independent) mutation process along the
ancestral lineages, the distribution of quantities that can be read off a sample of genes, such as
allele frequencies, will be the same if the partition structures of the genealogies are the same.
(Note however that the time rescaling that is used in Theorem 4.1 depends explicitly on L.)

From a biological point of view, Theorem 4.1 thus implies that spatial structure as well as
larger variances of the underlying offspring distribution are harder to detect from a sparse sample
on a large space. However, we also conduct simulations that show that for moderately large L
the different structure is still evident in the frequency spectrum of a sample. We postpone the
detailed description of relations of our main limit result to the existing literature as well as further
discussion of the analytical as well as simulation results to Section 5.



2. The spatial Λ-coalescent

In this section, we rigorously define the spatial Λ-coalescent. We start by introducing some
necessary notation. First, in order to describe not only the number of ancestors of a sample
of size n at a given time but also the structure of the genealogy, it is convenient to consider the
coalescent as partition valued. We label the sampled individuals from 1 to n and want to describe
the corresponding n-coalescent, first in a non-spatial setting.

Here, we consider as the state space $\mathcal{P}_n$, the partitions of $[n] := \{1, \dots, n\}$. We call the elements
of a partition $\pi \in \mathcal{P}_n$ blocks and may represent $\pi$ uniquely by

$$\pi := (B_1, B_2, \dots), \qquad (2)$$

where $B_i \subset [n]$ with $\bigcup_i B_i = [n]$ are the blocks indexed by the order of their minimal element (with
the convention that $\min \emptyset = \infty$). Each block represents the common ancestor of the individuals
contained in this block. Initially, the process is generally started with all singletons, so in the state
$(\{1\}, \dots, \{n\}, \emptyset, \dots)$. At each coalescence event an appropriate collection of blocks is merged into
one new larger block.

Λ-coalescents can more generally be defined on the state space $\mathcal{P}$, the partitions of all of $\mathbb{N}$
with representations as in (2). Due to the exchangeability of all blocks an n-coalescent is obtained
from this coalescent started with infinitely many individuals if one restricts the attention to the
partition structure induced on a subset of $\mathbb{N}$ of size n.

For describing the spatial distribution of ancestral lineages of samples of size $n \in \mathbb{N}$ we
consider as the state space for the spatial Λ-coalescent the labeled partitions $\mathcal{P}^\ell_n$ of $[n]$. This is the set of
partitions of $[n]$ for which each block carries a label in G specifying its location. To make the
notation more precise, we will represent an element $\pi^\ell \in \mathcal{P}^\ell_n$ uniquely by

$$\pi^\ell := ((B_1, \zeta_1), (B_2, \zeta_2), \dots), \qquad (3)$$

where the sequence of blocks is ordered as before and $\zeta_i \in G$, $i \in \mathbb{N}$, are the labels of the ordered
blocks. Set $\zeta_i = \partial \notin G$ if $B_i = \emptyset$.
If we are interested in the genealogy of samples of infinite size we consider labeled partitions
$\mathcal{P}^\ell$ of $\mathbb{N}$. Elements of $\mathcal{P}^\ell$ are also uniquely described by (3) with $B_i \subset \mathbb{N}$, where $\bigcup_i B_i = \mathbb{N}$ and
$\zeta_i \in G$. For $\pi^\ell$ in $\mathcal{P}^\ell_n$ or $\mathcal{P}^\ell$ of the form (3) we will write $\pi$ for the corresponding unlabeled
partition $\pi := (B_1, B_2, \dots)$ in $\mathcal{P}_n$ or $\mathcal{P}$ respectively. We will denote by $\#\pi = \#\pi^\ell$ the number of
(nonempty) blocks in the partition, which may be finite or infinite. Also write $\#\pi^\ell_x$ for the number
of blocks with label $x \in G$.

For any element $\pi^\ell \in \mathcal{P}^\ell_n$ or $\mathcal{P}^\ell$ with $n \ge m$ define $\pi^\ell|_m \in \mathcal{P}^\ell_m$ as the labeled partition induced
by $\pi^\ell$ on $[m]$. We equip $\mathcal{P}^\ell_n$ with the metric

$$d_n(\pi^{\ell,1}, \pi^{\ell,2}) = \sup_{m \in [n]} 2^{-m} \, 1_{\{\pi^{\ell,1}|_m \ne \pi^{\ell,2}|_m\}}, \qquad (4)$$

and likewise $\mathcal{P}^\ell$ with the analogous metric $d$, where we replace $[n]$ with $\mathbb{N}$. It follows that
$(\mathcal{P}^\ell_n, d_n)$ and $(\mathcal{P}^\ell, d)$ are both compact Polish spaces (for finite G), and that
$d(\pi^{\ell,1}, \pi^{\ell,2}) = \sup_n d_n(\pi^{\ell,1}|_n, \pi^{\ell,2}|_n)$. Analogously, we define a metric on the unlabeled partition spaces $\mathcal{P}_n$
and $\mathcal{P}$ by omitting all the superscripts $\ell$ in the above.
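
As an illustration of the labeled partitions, their restrictions and the metric just introduced, the following sketch represents a labeled partition as a list of (block, label) pairs ordered by minimal element. It assumes the metric is read as the supremum over m of $2^{-m}$ times the indicator that the restrictions to [m] differ; the function names are illustrative only.

```python
def normalize(blocks):
    """Order nonempty labeled blocks (block, label) by their minimal element."""
    labeled = [(frozenset(b), lab) for b, lab in blocks if b]
    return sorted(labeled, key=lambda bl: min(bl[0]))


def restrict(partition, m):
    """Labeled partition induced on [m] = {1, ..., m}."""
    keep = set(range(1, m + 1))
    return normalize((set(b) & keep, lab) for b, lab in partition)


def d_n(p1, p2, n):
    """sup over m in [n] of 2^{-m} * 1{restrictions to [m] differ}."""
    return max(
        (2.0 ** -m for m in range(1, n + 1) if restrict(p1, m) != restrict(p2, m)),
        default=0.0,
    )
```

Two labeled partitions that first differ when restricted to [m] are thus at distance $2^{-m}$, so partitions agreeing on a long initial segment of the sample are close.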

We are now ready to rigorously define the spatial Λ-coalescent. Let $D(\mathbb{R}_+, E)$ be the space of
right continuous functions from $\mathbb{R}_+$ into a metric space E that also have left limits. We equip
the space $D(\mathbb{R}_+, E)$ with the Skorohod topology, see for example Chapter 3 of Ethier and Kurtz
(1986) for details.

Definition 2.1. The spatial Λ-coalescent $\Pi^\ell$ with parameter $\Lambda = (\Lambda_x)_{x \in G}$, where $\Lambda_x$ are finite
measures on [0, 1], is a $D(\mathbb{R}_+, \mathcal{P}^\ell)$-valued process with the following dynamics:

(i) Blocks with the same label x coalesce to create a new block with that same label according
to the (non-spatial) $\Lambda_x$-coalescent independently from blocks at other sites, meaning that
coalescence happens with the rates given in (1) for $\Lambda = \Lambda_x$.

(ii) The label of each block performs an independent random walk on G with the migration rate
from site x to site y given by $p(x, y)$, $x, y \in G$.

It is shown in Limic and Sturm (2006, Theorem 1) that the spatial Λ-coalescent is a
well-defined strong Markov process for any initial condition in $\mathcal{P}^\ell$ in the case that $\Lambda_x$ are the same
measures for all $x \in G$ and $|G| < \infty$. However, the construction can easily be extended to this
more general setting and also to |G| countably infinite (some care must be taken here for starting
configurations with infinitely many blocks, but we will not need this here). As mentioned earlier,
the special case for which $\Lambda_x$ is given by $\delta_0$ up to a constant for all $x \in G$ is called the structured
coalescent (or spatial Kingman coalescent).
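
The dynamics of Definition 2.1 for a finite sample can be sketched as a standard event-by-event (Gillespie-type) simulation. The sketch below makes several assumptions that are not part of the definition: the graph is a small torus with sites $\{0, \dots, \text{side}-1\}^2$, each block migrates at total rate 1 to a uniformly chosen nearest neighbour, and the merger rates $\lambda_{b,k}$ are supplied through a hypothetical `rate(b, k)` argument (Kingman rates by default).

```python
import random
from math import comb


def kingman_rate(b, k):
    """lambda_{b,k} for Lambda = delta_0."""
    return 1.0 if k == 2 else 0.0


def simulate_spatial_coalescent(start, side, rate=kingman_rate, seed=0):
    """Run until one block remains; start is a list of (block, site) pairs."""
    rng = random.Random(seed)
    blocks = [(frozenset(b), tuple(s)) for b, s in start]
    t = 0.0
    while len(blocks) > 1:
        by_site = {}
        for i, (_, s) in enumerate(blocks):
            by_site.setdefault(s, []).append(i)
        # Migration: every block jumps at total rate 1.
        events = [("move", i, None, 1.0) for i in range(len(blocks))]
        # Coalescence: at a site with b blocks, some k-merger at rate C(b,k)*lambda_{b,k}.
        for s, idx in by_site.items():
            b = len(idx)
            for k in range(2, b + 1):
                r = comb(b, k) * rate(b, k)
                if r > 0:
                    events.append(("merge", s, k, r))
        t += rng.expovariate(sum(e[3] for e in events))
        kind, a, k, _ = rng.choices(events, weights=[e[3] for e in events])[0]
        if kind == "move":
            blk, (x, y) = blocks[a]
            dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
            blocks[a] = (blk, ((x + dx) % side, (y + dy) % side))
        else:
            chosen = rng.sample(by_site[a], k)  # uniformly chosen k blocks at site a
            merged = frozenset().union(*(blocks[i][0] for i in chosen))
            blocks = [bl for i, bl in enumerate(blocks) if i not in chosen]
            blocks.append((merged, a))
    return t, blocks[0]
```

Passing a different `rate` function changes only the coalescence mechanism at each site; the migration part is untouched, which mirrors the independence of items (i) and (ii).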

3. Spatial Cannings models and derivation of spatial Λ-coalescents

In this section we present a class of simple spatial population models, which we will
refer to as spatial Cannings models. We will show that the spatial genealogy of a sample of
individuals from various sites is described by the spatial Λ-coalescent of Definition 2.1 in the
large population limit. Here, the number |G| of sites in the graph may be finite or countably
infinite. The convergence result combines work by Herbots (1997), who considered convergence
to the structured coalescent, with results on convergence of genealogies to Λ-coalescents in the
non-spatial setting from Möhle and Sagitov (2001).
In the spatial Cannings model there are $N_x := r_x N \in \mathbb{N}$ individuals at site $x \in G$ at any
time, where $r_x$ is a constant. These populations reproduce within their own site at discrete times
$s \in \mathbb{N}_0 := \{0, 1, 2, \dots\}$ and immediately after birth disperse their offspring to other sites. Both
the reproduction as well as the dispersal or migration mechanism are chosen such that they leave
the population size at any site fixed. The offspring law at site $x \in G$ is described by

$$\nu^x = (\nu^x_1, \nu^x_2, \dots, \nu^x_{N_x}), \qquad (5)$$

where $\nu^x_i$ is the number of offspring of individual $i$ in the previous generation (all at site x). In
order to keep the population sizes constant over time we assume that $\sum_{i=1}^{N_x} \nu^x_i = N_x$.

For the offspring distributions $\nu^x$ we will furthermore assume that they are exchangeable such
that for any permutation $\sigma$ of the $N_x$ indices we have

$$(\nu^x_1, \dots, \nu^x_{N_x}) \overset{d}{=} (\nu^x_{\sigma(1)}, \dots, \nu^x_{\sigma(N_x)}). \qquad (6)$$

Cannings (1974) first considered non-spatial versions of this model with offspring distributions 
satisfying these properties. 

Due to the exchangeability and the criticality of the reproduction we have $E(\nu^x_1) = 1$. The
probability that any two individuals at site x have a common ancestor in the previous generation
is given by

$$c^x_N := \sum_{i=1}^{N_x} E\left( \frac{\nu^x_i (\nu^x_i - 1)}{N_x (N_x - 1)} \right) = \frac{E((\nu^x_1)_2)}{N_x - 1} = \frac{Var(\nu^x_1)}{N_x - 1}, \qquad (7)$$

where we are denoting $(m)_k := \frac{m!}{(m-k)!}$. After reproduction a fixed number of offspring $n_{xy}$ selected
at random without replacement from site x migrate to site y for all $x, y \in G$. Thus, for each $x \in G$,
we need that $\sum_{y \in G} n_{xy} \le N_x$. In order to keep the population sizes at all sites fixed even after the
dispersal due to migration we also require balancing migration, meaning that

$$\sum_{y \in G} n_{xy} = \sum_{y \in G} n_{yx} \quad \text{for all } x \in G. \qquad (8)$$



Taking reproduction and migration together specifies the spatial Cannings model fully. We note
that the special case considered by Herbots (1997) corresponds to Wright-Fisher reproduction in
all colonies (which are taken to be of the same size N) such that $\nu^x$ of (5) is given by a symmetric
multinomial distribution for all x.
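
To make the pair coalescence probability concrete, the following sketch evaluates $E((\nu_1)_2)/(N-1)$ exactly for two standard offspring laws at a single site of size N. It assumes the marginal law of $\nu_1$ is Binomial(N, 1/N) for Wright-Fisher reproduction and that of a uniformly permuted vector (2, 0, 1, ..., 1) for Moran-type reproduction; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def pair_coalescence_prob(pmf, N):
    """E[nu_1 (nu_1 - 1)] / (N - 1), with pmf mapping j to P(nu_1 = j)."""
    second_factorial_moment = sum(p * j * (j - 1) for j, p in pmf.items())
    return second_factorial_moment / (N - 1)


def wright_fisher_pmf(N):
    """Marginal of a symmetric multinomial: nu_1 ~ Binomial(N, 1/N)."""
    q = Fraction(1, N)
    return {j: comb(N, j) * q**j * (1 - q) ** (N - j) for j in range(N + 1)}


def moran_pmf(N):
    """Marginal of a uniform permutation of (2, 0, 1, ..., 1)."""
    return {2: Fraction(1, N), 0: Fraction(1, N), 1: Fraction(N - 2, N)}
```

With these assumptions the Wright-Fisher value is exactly 1/N and the Moran value is 2/(N(N-1)), so the natural time scales of the two models differ by a factor of order N.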

In order to study the associated genealogies of a sample we still need the following notation.
After reproduction and migration, the proportion of individuals in site x who were born in site y
is given by

$$p_{xy} := \frac{n_{yx}}{N_x} = \frac{N_y q_{yx}}{N_x} = \frac{r_y q_{yx}}{r_x}, \qquad (9)$$

where $q_{yx}$ is the proportion of individuals in site y leaving for site x. We also set $p_x = \sum_{y \ne x} p_{xy}$,
which is the proportion of individuals at site x that migrated there.
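
A small worked example may help with the direction of the indices. The sketch below assumes that the migrant counts are stored in a dictionary keyed by (origin, destination) and reads "balancing migration" as equality of emigrants and immigrants at every site; the site names and numbers are made up for illustration.

```python
from fractions import Fraction


def is_balanced(n_mig, sites):
    """Check that every site sends out as many migrants as it receives."""
    return all(
        sum(n_mig.get((x, y), 0) for y in sites if y != x)
        == sum(n_mig.get((y, x), 0) for y in sites if y != x)
        for x in sites
    )


def backward_migration(n_mig, sizes):
    """p_xy = n_yx / N_x: share of site x made up of individuals born in site y."""
    return {(x, y): Fraction(n, sizes[x]) for (y, x), n in n_mig.items() if x != y}


sizes = {"a": 100, "b": 200, "c": 100}
n_mig = {("a", "b"): 5, ("b", "a"): 5, ("b", "c"): 10, ("c", "b"): 10}
p = backward_migration(n_mig, sizes)
p_b = sum(v for (x, _), v in p.items() if x == "b")  # migrant share at site b
```

Note that $p_{xy}$ is normalised by the size of the receiving site x, which is why the forward counts $n_{yx}$ appear with reversed indices.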

Now let the spatial Cannings-coalescent $\Pi^{\ell,N} = (\Pi^{\ell,N}_s)_{s \in \mathbb{N}_0}$ be derived from a spatial Cannings
model such that this $\mathcal{P}^\ell_n$-valued process is obtained from sampling n individuals at the present,
whose locations are represented by $\pi^\ell \in \mathcal{P}^\ell_n$, and then following their genealogy into the past.



In order to obtain a spatial Λ-coalescent in continuous time from Definition 2.1 in the large
population limit as $N \to \infty$ we will rescale time for this process as a function of $N$ and let
$N \to \infty$. Set

$$t_N := \left\lfloor \frac{t}{c_N} \right\rfloor, \qquad (10)$$
where $c_N$ is related to $c^x_N$ and specified later on. Naturally, if the $c^x_N$ are independent of $x \in G$ we will
set $c_N = c^x_N$. This would in particular be the case in a spatially homogeneous situation in which
all colonies are of the same size and display the same reproductive behavior, meaning that $r_x = 1$
so that $N_x = N$ and the offspring law $\nu^x$ is the same for all sites $x \in G$.

Further, we will need an asymptotic moment condition as in Möhle and Sagitov (2001) stating
that

$$\phi^x_j(k_1, \dots, k_j) := \lim_{N \to \infty} \left( N_x^{k_1 + \dots + k_j - j} \, c^x_N \right)^{-1} E\left( (\nu^x_1)_{k_1} \cdots (\nu^x_j)_{k_j} \right) \qquad (11)$$

exist for all $x \in G$, $j \in \mathbb{N}$ and $k_1, \dots, k_j \ge 2$. We are now ready to state a convergence in
distribution result for the spatial Cannings-coalescent.

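Proposition 3.1. Assume that there exists a sequence $(c_N)_{N \in \mathbb{N}} \subset \mathbb{R}_+$ with $\lim_{N \to \infty} c_N = 0$ and
constants $c_x$ such that for any $x \in G$ we have $\lim_{N \to \infty} c^x_N / c_N = c_x$ with $\sup_{x \in G} c_x < \infty$. Also assume
that $\lim_{N \to \infty} p_{xy} / c_N = p(x, y)$ where $\sup_{x \in G} \sum_{y \ne x} p(x, y) < \infty$. Assume further that $\phi^x_1(k)$ exist for all
$x \in G$ with $\sup_{x \in G} \phi^x_1(k) < \infty$ for all $k \ge 2$ as well as

$$\phi^x_2(2, 2) = \lim_{N \to \infty} (N_x^2 \, c^x_N)^{-1} E\left( (\nu^x_1)_2 \, (\nu^x_2)_2 \right) = 0 \qquad (12)$$

for all $x \in G$. For $n \in \mathbb{N}$ and $\pi^\ell \in \mathcal{P}^\ell_n$ let $\Pi^{\ell,N}$ be the spatial Cannings-coalescent started in $\pi^\ell$.
Then we obtain weak convergence

$$(\Pi^{\ell,N}_{\lfloor t / c_N \rfloor})_{t \ge 0} \Rightarrow (\Pi^\ell_t)_{t \ge 0} \quad \text{in } D(\mathbb{R}_+, \mathcal{P}^\ell_n) \qquad (13)$$

as $N \to \infty$, where $\Pi^\ell$ is the spatial Λ-coalescent as in Definition 2.1 also started in $\pi^\ell$ with $\Lambda_x$
characterized by the moments given by

$$\int_0^1 z^{k-2} \, \Lambda_x(dz) = c_x \, \phi^x_1(k), \quad k \ge 2. \qquad (14)$$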



The condition (12) implies that $\phi^x_j(k_1, \dots, k_j) = 0$ for $j \ge 2$ and all $x \in G$ by monotonicity in j, see (18)
of Möhle and Sagitov (2001). This also implies immediately that we obtain convergence to the
spatial Kingman coalescent if and only if

$$\phi^x_1(3) = \lim_{N \to \infty} (N_x^2 \, c^x_N)^{-1} E\left( (\nu^x_1)_3 \right) = 0. \qquad (15)$$



As mentioned earlier, a special case of Proposition 3.1 was proved by Herbots (1997). For the
Wright-Fisher reproduction she considers we set $c_N = \frac{1}{N}$. Then under the same assumptions
Herbots shows convergence of the number of blocks on the space $D(\mathbb{R}_+, \mathbb{N}^{|G|})$. Since (15) is
satisfied, the limit is the block counting process of the structured coalescent. Here, we consider
convergence of spatial partition valued processes in a more general setting.



Proposition 3.1 in the non-spatial situation for which |G| = 1 was proved by Möhle and
Sagitov (2001). They considered convergence of partition valued processes for even more general
offspring distributions and proved that the limiting process Π is a coalescent with simultaneous
multiple collisions, or Ξ-coalescent, if and only if (11) is satisfied. Proposition 3.1 can easily be
extended to those settings as well but since we will, for ease of notation, only be dealing with
spatial Λ-coalescents later on we restrict our attention to this subclass.
4. Behavior of the spatial Λ-coalescent on a large 2-dimensional torus

From now on let $G = T^L$, and in order to record the dependence on L we will write $\mathcal{P}^{\ell,L}_n$ and
$\mathcal{P}^{\ell,L}$ for the partitions with labels in $T^L$. We will for notational simplicity assume that $\Lambda_x = \Lambda$ is
constant for all $x \in T^L$ and that $\Lambda([0, 1]) > 0$. (It really suffices to assume that $\lambda_{2,2}$ is bounded
above and below by positive constants uniformly in $x \in T^L$.) In order to specify the migration on
the torus we first consider transition rates $p : \mathbb{Z}^2 \to [0, 1]$ and define for all $L \in \mathbb{N}$ transition rates
$p^L$ of a random walk on $T^L$ by

$$p^L(x, y) = \sum_{z \in \{z' \in \mathbb{Z}^2 \,|\, z' - y \bmod (2L+1) = 0\}} p(z - x). \qquad (16)$$

This means that ancestral lines that "migrate out" on one side of $T^L$ will "migrate in" again
on the other side. To simplify the arguments in the following we will assume that the rates p
are those of a simple symmetric random walk on $\mathbb{Z}^2$ that migrates to each of the four nearest
neighbour sites with equal rates $\frac{1}{4}$. However, any spatially homogeneous, symmetric random
walk satisfying a suitable moment condition could also be considered (see also the remark at the
end of this section).

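On $T^L$ we also define the following metric $r^L$ appropriate for the torus,

$$r^L(x, y) := \inf\{ \|x - z\|_2 \,|\, z \in \mathbb{Z}^2, \ y - z \bmod (2L+1) = 0 \}, \qquad (17)$$

where $\|x\|_2 := \sqrt{x_1^2 + x_2^2}$ for $x \in \mathbb{Z}^2$. Furthermore, we define for $0 \le a \le b$ the set of all labeled
partitions with pairwise distances in $[a, b]$,

$$[[a, b]] := \{ \pi^\ell = ((B_1, \zeta_1), (B_2, \zeta_2), \dots) \in \mathcal{P}^{\ell,L}_n \,|\, r^L(\zeta_i, \zeta_j) \in [a, b] \text{ for all } i \ne j \text{ with } B_i, B_j \ne \emptyset \}. \qquad (18)$$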
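
The torus metric and the distance condition on the block labels can be computed directly. The sketch below assumes sites are integer pairs in $[-L, L]^2$ and that the infimum over periodic copies is attained by taking, in each coordinate, the shorter way around the torus; the function names are illustrative.

```python
from itertools import combinations
from math import hypot


def torus_distance(x, y, L):
    """Euclidean distance between x and the nearest periodic copy of y."""
    side = 2 * L + 1
    dx = abs(x[0] - y[0]) % side
    dy = abs(x[1] - y[1]) % side
    return hypot(min(dx, side - dx), min(dy, side - dy))


def in_distance_class(labels, a, b, L):
    """True if all pairwise torus distances of the block labels lie in [a, b]."""
    return all(a <= torus_distance(x, y, L) <= b for x, y in combinations(labels, 2))
```

For example, on the torus with L = 2 the sites (-2, 0) and (2, 0) are neighbours, since the walk wraps around after five steps in each direction.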

In order to describe the asymptotic behavior of the spatial Λ-n-coalescent started with n
individuals we need some more notation: For $\pi \in \mathcal{P}_n$ let

be the non-spatial Kingman coalescent on $[n]$ started in $\pi$. We also define the sequence $(s_L)_{L \in \mathbb{N}}$,
which will be used to rescale time, by

$$s_L := (2L + 1)^2 \log(2L + 1). \qquad (20)$$

Our main theorem then states the following. 

Theorem 4.1. Let $n \ge 2$ and let $(a_L)_{L \in \mathbb{N}}$ be a nonnegative sequence with
lim ZT 1 -0ogL fl£ = oo, lim L~ l a L = 

L— >oo L— >oo 

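and $a_L \le \sqrt{2} L$ for all $L \in \mathbb{N}$. Also, let $\pi_0 \in \mathcal{P}_n$ and $\pi^{\ell,L} \in \mathcal{P}^{\ell,L}_n$ such that

$$\pi^{\ell,L} \in [[a_L, \sqrt{2} L]]$$

and $\pi^L = \pi_0$ for all $L \in \mathbb{N}$ large enough. For $L \in \mathbb{N}$ let $\Pi^{\ell,L}$ be a spatial Λ-n-coalescent started
in $\pi^{\ell,L}$. Then, we obtain weak convergence in the Skorohod space $D(\mathbb{R}_+, \mathcal{P}_n)$ for $L \to \infty$, more
precisely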
Note that we will generally assume that L is large enough so that $\pi^{\ell,L} \in \mathcal{P}^{\ell,L}_n$ corresponding to
a given $\pi_0 \in \mathcal{P}_n$ can be found. Observe also that in the limiting non-spatial Kingman coalescent
there is an additional time change by the number $\pi$. The time change and so the entire limit
process does not depend on the details of the coalescence mechanism of the spatial Λ-coalescent.

Finally, we remark that the result of Theorem 4.1 is also expected to hold if the random walk
p on $\mathbb{Z}^2$ is symmetric and spatially homogeneous, and satisfies a suitable moment condition. In
this case, the time scaling by $\pi$ in (21) would have to be replaced by $\sigma^2 \pi$ with $\sigma^2$ the variance of
distance traversed by the random walk in one unit of time.



5. Discussion 

Knowledge of the genealogies together with an independent mutation process, in our context 
generally a Poisson process along the ancestral lineages, allows one to describe the distribution of
quantities of interest in population genetics that can be read off the genetic variability in the 
sample, such as site or allele frequency spectra. 

In the non-spatial situation many results are available for the Kingman coalescent case
modeling populations with low offspring variance, a prominent example being the Ewens sampling
formula for the allele frequency spectrum, see for example Wakeley (2009) for an overview. For
non-spatial Λ-coalescents modeling populations with larger offspring variances the analysis is
more complicated and the available results are not as complete or explicit. However, see for
example Möhle (2006a; 2006b), Berestycki et al. (2007), Birkner et al. (2011), and Berestycki
et al. (unpublished manuscript) for a variety of recent results that show that differences in the
underlying reproduction can lead to qualitatively different sampling distributions.

Likewise, the analysis of genetic variability in the sample becomes much more difficult due to 
spatial structure. Generally, the influence of the spatial structure will again lead to qualitatively 
different sampling distributions. However, this depends on the underlying space and in particular 
on the distances of the sampled individuals. 

Our main results state that the genealogy of individuals sampled far enough apart on a large
two dimensional torus $T^L$ can be approximated by the genealogy of a non-spatial Kingman
coalescent even when the offspring variances in the underlying population models are larger.
On the one hand, this means that the results available for the non-spatial Kingman coalescent can
be used to approximate the genealogy and the resulting sampling distributions. On the other hand, from
the point of view of a population geneticist it is a negative result: If one samples individuals
relatively far apart then the influence of a large variance in the offspring distribution as well as of
spatial structure is harder to detect in the sample.

In order to derive this result we consider spatial Λ-coalescents on $T^L$, which model the
genealogies of individuals sampled from large populations living and migrating on $T^L$ whose
offspring distributions have potentially larger variances. In Section 3 we introduced spatial
Cannings models as a large class of models that fit into this framework, a statement that is made
precise by Proposition 3.1.



The main result, Theorem 4.1, then states that the (unlabeled) partition structure of the suitably
time changed spatial Λ-n-coalescents on $T^L$ converges to that of a non-spatial Kingman
coalescent as $L \to \infty$ provided that the individuals are sampled far enough apart. This kind of behavior
arises since in the chosen scaling and due to the sparse sample it is unlikely that more than two
ancestral lines, represented by blocks, ever meet at the same site. Thus, only binary mergers may
take place. On the other hand, a meeting of two ancestral lines is followed up by many more
meetings of these two so that they eventually coalesce (regardless of the rate of coalescence)
before encountering any other ancestral lines. The sequence of meetings of two ancestral lines and
thus also their coalescence happens instantaneously as $L \to \infty$, which implies that for large L the
spatial Λ-n-coalescent behaves like a coalescing random walk (with instantaneous coalescence
of lines that have met). In contrast, the time between encounters and eventual coalescence of
pairs of lines is long enough so that all ancestral lines have in the meantime become well mixed
on the torus. Hence, any two of them are equally likely to participate in the next meeting, which
leads in the limit to the exchangeability property of the non-spatial Kingman coalescent.
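
The time scale of the first meeting of two ancestral lines can be explored numerically. The sketch below simulates the meeting time of two independent simple random walks on the torus, each jumping at total rate 1, by following their difference, which is a simple random walk jumping at rate 2. If the time change in Theorem 4.1 is read as the constant $\pi$, one would expect the mean meeting time of two lines started far apart, divided by $(2L+1)^2 \log(2L+1)$, to approach $1/\pi$ slowly as L grows; this expectation is a heuristic reading and is not checked by the code.

```python
import random
from math import log


def meeting_time(x, y, L, rng):
    """First time two independent rate-1 simple random walks share a site."""
    side = 2 * L + 1
    dx, dy = (x[0] - y[0]) % side, (x[1] - y[1]) % side
    t = 0.0
    while (dx, dy) != (0, 0):
        t += rng.expovariate(2.0)  # the difference walk jumps at rate 2
        sx, sy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        dx, dy = (dx + sx) % side, (dy + sy) % side
    return t


def time_scale(L):
    """s_L = (2L + 1)^2 log(2L + 1)."""
    return (2 * L + 1) ** 2 * log(2 * L + 1)


def mean_rescaled_meeting_time(x, y, L, runs, seed=0):
    """Monte Carlo estimate of E[meeting time] / s_L."""
    rng = random.Random(seed)
    return sum(meeting_time(x, y, L, rng) for _ in range(runs)) / (runs * time_scale(L))
```

Because the relevant corrections are logarithmic in L, any agreement with the limiting constant should be expected to improve only slowly with the side length.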

The result of Theorem 4.1 is analogous to one obtained by Limic and Sturm (2006) for the d
dimensional torus with $d \ge 3$. However, the scaling for $d = 2$ is different and more subtle. This is
due to the fact that the random walk performed by the ancestral lines is recurrent in $d = 2$ while
it is transient for $d \ge 3$. The recurrence in two dimensions also leads to the many encounters
and in the limit $L \to \infty$ to instantaneous coalescence of a pair of lines that have met at the same
site. As a consequence the time change of the limiting Kingman coalescent is independent of the
Λ-measure of the underlying coalescent mechanism in d = 2 unlike in d = 3. The prior work
by Limic and Sturm (2006) generalised results by Greven et al. (2005) who considered spatial
Kingman coalescents in $d \ge 3$ (for some results in $d = 2$ in the analogous setting but with a
different focus see also Greven et al. (2012)).

The observation that the influence of spatial structure on the genealogy of a sample is not
readily detectable in certain situations (except possibly through a space dependent time change)
has been made before. Related results have in particular been obtained in the articles by Cox
and Durrett (2002) and Zähle et al. (2005). They study the classical stepping stone model on
the torus $T^L$. This is a spatial Moran model and thus a special case of the spatial Cannings
model introduced in Section 3 corresponding to setting $N_x = N$ and choosing $\nu^x$ in (5)
as a permutation of (2, 0, 1, . . . , 1). In Cox and Durrett (2002) and Zähle et al. (2005) meeting
times and coalescence times of ancestral lineages are analysed as L tends to infinity while the
population size N and migration rate depend on L in an appropriate way. In this setting they
also prove a result analogous to Theorem 4.1 stating that the genealogy of individuals sampled
far enough apart can be approximated by a non-spatial Kingman coalescent.

A related model in continuous space has recently been introduced by Barton et al. (2010).
Here, reproduction events are determined by Poisson point processes in space and involve
individuals in a certain neighbourhood. Rare extinction-recolonisation events lead to a genealogy
that is described by a spatial coalescent with a Λ-coalescent mechanism affecting ancestral lines
in the neighbourhood chosen by the Poisson point process.

These Λ-coalescents in continuous space differ from the ones considered here in discrete
space. Nevertheless, an analogous result to Theorem 4.1 is obtained when individuals sampled
far enough apart on a two dimensional continuous torus are considered. As its side length L tends
to infinity, the limiting genealogy is described by a non-spatial Kingman coalescent, another type
of a spatial Λ-coalescent or coalescing Brownian motions with non-local coalescence, depending
on the scaling of the neighbourhood sizes that affect multiple merger events.

In this work, the behavior of the spatial Λ-coalescent when individuals are not sampled far
apart has not been considered analytically. It is clear, however, that due to the recurrence of the
random walk in $d = 2$ coalescence of all ancestral lines would happen instantaneously on the
time scale considered in Theorem 4.1 in the limit as $L \to \infty$. This is in contrast to results in
$d \ge 3$ as in Limic and Sturm (2006), where there is a nontrivial distribution of ancestral lines (far
apart from each other) for all large L.

For ancestral lines sampled close to each other the behavior of the ancestral lines is often
described by distinguishing two phases, the first phase termed the scattering phase that (likely)
involves initial rapid coalescence and lasts until ancestral lines are far apart (and well mixed)
in space, and the second phase of subsequent slow coalescence termed the collecting phase, see
Wakeley (2001). Given this terminology, our analytical results are restricted to the collecting
phase. For some analytical results on the genealogy for the scattering and collecting phase under
various model assumptions, see Zähle et al. (2005), Etheridge and Veber (in press 2012), and
Greven et al. (unpublished manuscript).

In order to further examine the asymptotic results for the spatial Λ-coalescent we performed
simulations of the spatial Λ-coalescent as well as the (non-spatial) Kingman coalescent on a
moderately large two dimensional torus. We then compared the mean allele frequency spectrum
of both processes, which we additionally juxtaposed with its expectation in the Kingman
coalescent calculated using the Ewens sampling formula. We also compared the total tree length of both
processes via a q-q-plot.
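
For the non-spatial reference process, the total tree length and the point pairs of a q-q-plot are easy to generate. The sketch below assumes the standard Kingman n-coalescent with pair coalescence rate 1, so that while k lineages remain the waiting time is exponential with rate $\binom{k}{2}$ and contributes k times its length to the tree; the function names are illustrative and no plotting library is assumed.

```python
import random


def kingman_tree_length(n, rng):
    """Total branch length of a Kingman n-coalescent tree."""
    length = 0.0
    for k in range(n, 1, -1):
        length += k * rng.expovariate(k * (k - 1) / 2)
    return length


def expected_kingman_tree_length(n):
    """E[total length] = sum_k k / C(k, 2) = 2 * (1 + 1/2 + ... + 1/(n - 1))."""
    return 2 * sum(1 / k for k in range(1, n))


def qq_pairs(sample_a, sample_b):
    """Pair the order statistics of two equally sized samples."""
    return list(zip(sorted(sample_a), sorted(sample_b)))
```

Points of `qq_pairs` lying close to the diagonal indicate that the two samples of tree lengths have similar distributions; a systematic deviation above the diagonal indicates longer trees in the second sample.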

The allele frequency spectrum is computed under the infinite alleles model. In order to test
how well the approximation performs for moderately large L, mutations, which are assumed to
always generate novel alleles, are placed on the genealogical tree of the processes according to
a Poisson point process with rate $\pi \cdot ((2L+1)^2 \log(2L+1))^{-1}$. We start with n
singletons at time 0. Whenever a block is hit by a mutation we count the number of individuals
in the block. Thus, if there are k individuals in a block that is hit by a mutation, a k-tuple of
individuals carrying this mutation is generated. We then remove these individuals from the
system. When all individuals are removed we count the number of k-tuples; this number shall be
called $a_k$. The vector $(a_1, \dots, a_n)$ is the desired frequency spectrum. We perform m
independent simulations and generate $a_i^j$, the i-th component of the allele frequency
spectrum of the j-th simulation. The mean frequency spectrum is now given by
$\frac{1}{m} \sum_{j=1}^{m} (a_1^j, \dots, a_n^j)$.
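The counting procedure just described can be sketched for the non-spatial Kingman coalescent as follows. This is a minimal sketch, not the authors' simulation code: it assumes pairwise coalescence at rate 1 and a per-lineage mutation rate `mutation_rate` (both hypothetical parametrizations), and it only tracks block sizes since the spatial positions play no role in the non-spatial case.

```python
import random


def frequency_spectrum(n, mutation_rate, rng):
    """One realization of (a_1, ..., a_n): blocks hit by a mutation are removed
    and counted according to their size."""
    if mutation_rate <= 0:
        raise ValueError("mutation_rate must be positive so that every block is removed")
    blocks = [1] * n        # block sizes; n singletons at time 0
    spectrum = [0] * n      # spectrum[k - 1] counts the k-tuples
    while blocks:
        b = len(blocks)
        coalescence = b * (b - 1) / 2.0   # total pairwise coalescence rate (assumed rate 1 per pair)
        mutation = mutation_rate * b      # total mutation rate over all blocks
        if rng.random() * (coalescence + mutation) < mutation:
            k = blocks.pop(rng.randrange(b))      # a block of size k is hit and removed
            spectrum[k - 1] += 1
        else:
            i, j = sorted(rng.sample(range(b), 2))
            blocks[i] += blocks.pop(j)            # merge two uniformly chosen blocks
    return spectrum


def mean_spectrum(n, mutation_rate, m, seed=0):
    """Average of m independent frequency spectra."""
    rng = random.Random(seed)
    total = [0] * n
    for _ in range(m):
        for k, a in enumerate(frequency_spectrum(n, mutation_rate, rng)):
            total[k] += a
    return [t / m for t in total]
```

Only the order of events matters for the spectrum, so the sketch works with the embedded jump chain and never draws waiting times.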

First, we start with n = 9 individuals sampled far apart on the torus at time 0. We arrange them
on the torus in a 3 x 3 square such that the distance of next neighbours is a third of the side
length L' = 2L + 1.
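A small sketch of such a starting configuration is given below. The helper names and the use of the maximum norm with wrap-around as the torus distance are assumptions made for illustration only.

```python
def grid_start_positions(side, per_row=3):
    """Place per_row x per_row individuals on a torus of the given side length,
    with neighbours a (per_row)-th of the side length apart."""
    step = side // per_row
    return [(i * step, j * step) for i in range(per_row) for j in range(per_row)]


def torus_distance(p, q, side):
    """Maximum-norm distance on the torus (assumed metric), respecting wrap-around."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    return max(min(dx, side - dx), min(dy, side - dy))


# Example: the sparse configuration for side length L' = 99.
positions = grid_start_positions(99)
```

Because of the wrap-around, the individuals in the first and last row (or column) are also exactly one step apart, so all nine individuals are equally well separated.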

Regarding the various coalescence behaviors, we consider the (instantaneously) coalescing
random walk, the spatial Bolthausen-Sznitman coalescent (Λ uniform on [0, 1]) and the structured
coalescent (the spatial Kingman coalescent with $\Lambda = \delta_0$). For the side lengths
L' = 99 and L' = 198 we perform m = 100 independent simulations.

Figure 3: q-q-plot of the rescaled total tree length of the spatial Bolthausen-Sznitman coalescent
versus the total tree length of the Kingman coalescent for L' = 198.

Figure 4: q-q-plot of the rescaled total tree length of the coalescing random walk versus the
total tree length of the Kingman coalescent for L' = 198.

Figures 1 and 2 show that the results for the coalescing random walk are much closer to the
limiting non-spatial Kingman coalescent than those for the spatial Λ-coalescents. The spatial
Λ-coalescents produce more singletons in the allele frequency spectrum in comparison with the
Kingman coalescent. Moreover, they also produce a larger total tree length (compare figures 3
and 4).

Comparing figures 1 and 2 we note in addition that there is not much of an improvement
between the L' = 99 and the L' = 198 case, which is a sign of a slow convergence rate in Theorem
4.1. This can be explained by recalling the proof of the result one more time. We show the
convergence result first for the coalescing random walk. We then show that for large L the
spatial Λ-coalescent behaves like a coalescing random walk since any pair of ancestral lines that
meet in the spatial Λ-coalescent will with high probability on a large torus meet again and again
(before meeting other ancestral lines) until the pair eventually coalesces. This additional time
until coalescence vanishes in the limit. Nonetheless, for moderately large L it is not surprising
that we see longer coalescence times for the spatial Λ-coalescents than for the coalescing random
walks and therefore also more singletons in the allele frequency spectrum, since singletons have
more time to get hit by a mutation before coalescing. Note that the behavior of the
Bolthausen-Sznitman coalescent and the structured coalescent are quite similar since it is mostly
$\lambda_{2,2}$ that dictates how much longer the spatial Λ-coalescent takes to coalesce, and we
have $\lambda_{2,2} = 1$ for both of these choices.

The slow convergence rate may be explained by looking at the expectation of the time that it
takes for two blocks to leave the same site and then meet again. This expected time is of order
$L^2$ and after rescaling it is of order $(\log L)^{-1}$. Therefore, these expectations converge
only very slowly to zero.
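To get a feeling for how slow this is, the following sketch evaluates the order of the rescaled return time for the two side lengths used in the simulations. It assumes that the time scale is proportional to (side length)² · log(side length) and drops all constants, so only the relative decay is meaningful.

```python
import math


def rescaled_return_order(side):
    """Order of the expected return time after rescaling: side^2 / (side^2 * log(side)).
    Constants are dropped; the time scale s_L ~ side^2 * log(side) is an assumption."""
    unscaled = side ** 2
    time_scale = side ** 2 * math.log(side)
    return unscaled / time_scale


for side in (99, 198, 10 ** 6):
    print(side, rescaled_return_order(side))
```

Doubling the side length from 99 to 198 reduces the quantity only by the factor log(99)/log(198), which is in line with the small difference observed between the two simulated cases.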

Next, we consider individuals sampled close to each other on the torus and their behavior
under the spatial Bolthausen-Sznitman coalescent. Once the ancestral lines are mutually far apart
again, we also use the approximation by a non-spatial Kingman coalescent. Therefore we get two
results for the allele frequency spectrum, one given by the simulation of the spatial process and
the other given by simulating until ancestral lines are far apart and then approximating with the
Kingman coalescent.

Figure 5: Mean frequency spectrum of the Bolthausen-Sznitman coalescent and the approximation
with the Kingman coalescent with ancestral lines started close to each other for L' = 99.

Figure 6: Mean frequency spectrum of the Bolthausen-Sznitman coalescent and the approximation
with the Kingman coalescent with ancestral lines started at the same position for L' = 99.

Again we have n = 9 individuals to start with, the side length of the torus is L' = 99 and we
repeat the simulation m = 500 times. We start the approximation when the ancestral lines surpass
a mutual distance of 8.33. We either start the particles in a 3 x 3 scheme where next neighbours
have distance 1 or we start them all in the same site.

For both cases we compute the mean frequency spectrum (see figures 5 and 6). In comparison
with the sparse situation (figure 1) we observe that the means of $a_1$ and $a_2$ decrease and
the means of $a_6$, $a_7$, $a_8$ and $a_9$ increase. This effect is expected and is even
stronger for individuals started at the same site.

We observe again that the approximation with the Kingman coalescent gives less weight to 
singletons but since the number of ancestral lines left in the system when it has become sparse 
will be very small (possibly 1) the approximation is sometimes not even used so that the results 
appear to be better than in the case where we already start far apart. Here, the mean number of 
ancestral lines at the time of approximation was 2.6 for the close starting configuration and 4.0 
for all ancestral lines started in the same site. 

Finally, we note that if ancestral lines end up far apart the simulation of the spatial
Λ-coalescent will take significantly more time than the simulation of the approximation with the
non-spatial Kingman coalescent, so the approximation could be of practical use. Here, the
simulation with the approximation was more than 100 times faster than the simulation without it.

In conclusion, we recall that our theoretical convergence result states that information about
spatial structure and the specific coalescence mechanism (due to possibly larger variances in the
offspring distribution) is lost from a sparse sample as $L \to \infty$, except for a time change
that depends on L. However, we have seen from simulations that even if this time change is taken
into account the influence of the spatial structure is still quite evident for moderately large L.
In contrast, the details of the coalescence mechanism (apart from the delay that it takes a pair
of lineages to coalesce while at the same site) do not have a significant impact on the sampling
distribution for moderately large L in the case of a sparse sample.

In future theoretical work, it would be interesting to study the rate of convergence, and also 
whether it can be improved by a finer scaling that takes the coalescence delay into account. Also, 
theoretical investigation of the scattering phase for non sparse samples as well as the considera- 
tion of convergence as N and L tend to infinity jointly would be of interest. 



6. Proofs 

In this section we prove Theorem 3.1 and Theorem 4.1.



6.1. Proof of Theorem 3.1 



For the proof we fix the following notation. For a vector $a = (a_x)_{x \in G}$ we set
$\|a\| = \sup_{x \in G} |a_x|$. For sequences $(a^N)_{N \in \mathbb{N}}$ of vectors and
$(c_N)_{N \in \mathbb{N}}$ of nonnegative numbers we say that $a^N = O(c_N)$ if
$\sup_N \|a^N\| / c_N < \infty$, and $a^N = o(c_N)$ if $\lim_{N \to \infty} \|a^N\| / c_N = 0$.
If $A = (a_{ij})_{i,j \in I}$ is a matrix for a countable $I$ then we define the matrix norm by

$$\|A\| = \sup_{i \in I} \sum_{j \in I} |a_{ij}|. \qquad (22)$$

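In this proof we only consider labeled partitions and thus omit the superscript $\ell$ in the
notation whenever possible. So for n < N let $\pi, \tilde\pi \in \mathcal{P}_n^{\ell}$ and let
$p^N_{\pi\tilde\pi} = P(\Pi^N_1 = \tilde\pi \mid \Pi^N_0 = \pi)$ be the transition probabilities
of $\Pi^N$. We set $P_N = (p^N_{\pi\tilde\pi})_{\pi,\tilde\pi \in \mathcal{P}_n^{\ell}}$. Also
let $q_{\pi\tilde\pi}$ be the transition rate from $\pi$ to $\tilde\pi$ for $\Pi$ if
$\pi \ne \tilde\pi$ and set $q_{\pi\pi} = -\sum_{\tilde\pi \ne \pi} q_{\pi\tilde\pi}$ as well as
$Q = (q_{\pi\tilde\pi})_{\pi,\tilde\pi \in \mathcal{P}_n^{\ell}}$. For the proof of the
convergence of finite dimensional distributions it suffices to show that

$$P_N = I + c_N Q + o(c_N). \qquad (23)$$

The sufficiency follows since $\|P_N\| = 1$ and also $\|I + c_N Q\| = 1$ for N large due to the
fact that the assumptions of the proposition imply that the $q_{\pi\tilde\pi}$ are uniformly
bounded over all $\pi, \tilde\pi \in \mathcal{P}_n^{\ell}$. From this and (23) we then have

$$\left\| P_N^{[t/c_N]} - (I + c_N Q)^{[t/c_N]} \right\| \le \frac{t}{c_N} \left\| P_N - (I + c_N Q) \right\| = t \left\| \frac{P_N - I}{c_N} - Q \right\| \to 0 \qquad (24)$$

as $N \to \infty$, which implies that

$$\lim_{N \to \infty} P_N^{[t/c_N]} = \lim_{N \to \infty} (I + c_N Q)^{[t/c_N]} = e^{tQ}. \qquad (25)$$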



This in turn implies convergence of the finite dimensional distributions. 
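The limit in (25) can be illustrated numerically. The following sketch compares $(I + cQ)^{t/c}$ with $e^{tQ}$ in the row-sum norm for a small generator; the matrix `Q` used here (block counting of a Kingman coalescent with three lineages) and all helper names are assumptions chosen for illustration, not objects from the proof.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_pow(A, power):
    """Integer matrix power by repeated squaring."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    base = A
    while power > 0:
        if power & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        power >>= 1
    return result


def mat_exp(A, terms=60):
    """Matrix exponential via a truncated Taylor series (adequate for small matrices)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def row_sum_norm(A):
    """Matrix norm sup_i sum_j |a_ij|."""
    return max(sum(abs(x) for x in row) for row in A)


def euler_error(Q, t, steps):
    """|| (I + cQ)^steps - exp(tQ) || with c = t / steps."""
    n = len(Q)
    c = t / steps
    step = [[float(i == j) + c * Q[i][j] for j in range(n)] for i in range(n)]
    approx = mat_pow(step, steps)
    exact = mat_exp([[t * q for q in row] for row in Q])
    return row_sum_norm([[approx[i][j] - exact[i][j] for j in range(n)] for i in range(n)])


# Assumed example generator: number of lineages 3 -> 2 -> 1 at rates 3 and 1.
Q = [[-3.0, 3.0, 0.0], [0.0, -1.0, 1.0], [0.0, 0.0, 0.0]]
errors = [euler_error(Q, 1.0, steps) for steps in (100, 1000)]
```

The error shrinks roughly in proportion to c, which matches the telescoping bound in (24).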



Thus, we will first focus on showing relation (23). For notational simplicity we will in the
following omit a sub- or superscript N whenever this does not affect clarity, such as for the
quantities $p_{xy}$ and $p_x$. We can write $P_N = P_N^{(m)} \cdot P_N^{(r)}$ where
$P_N^{(m)} = (p^{(N,m)}_{\pi\tilde\pi})_{\pi,\tilde\pi \in \mathcal{P}_n^{\ell}}$ and
$P_N^{(r)} = (p^{(N,r)}_{\pi\tilde\pi})_{\pi,\tilde\pi \in \mathcal{P}_n^{\ell}}$
are the transition probability matrices of the coalescent due to the migration step
and reproduction step respectively. We first consider the migration step. The arguments will be
similar to those in the proof of Theorem 2.1 in Herbots (1997). If $\tilde\pi \in A^{\pi}_{xy}$,
the set of all partitions that result from $\pi$ by changing one label from x to y with
$x \ne y$, then we have that

(N,m) _ \P*yN s -l) \ p x N x -p„N x ) pr ^ p.N z ) „ 

V ™ ( N x \ ' ( N x -p„N x \ 1 1 / N t \ ' ( ' 

\PxyNj \p x N x -p„Nj Z*x \ Pr N K ) 
Here, the first factor is the probability of choosing one particular block (but no other blocks) as
migrants from site y to site x forward in time. The second factor represents the probability that 
all other migrants from other sites to x are drawn from outside the sample. Lastly, the product 
represents the probability that no migrants at sites other than x are in the sample. We can simplify 
this expression to 



(N,m) 



Pxy 



N x - p x N x - #n x + 



T nn 

zeG a=0 



N, - p,N z 



N- 



(27) 



We also have that 



E 



(28) 



where $R^{(N,m)}(\pi)$ is the probability that at least two of the migrants are drawn from the
sample. As in Herbots (1997) we can bound



$$R^{(N,m)}(\pi) \le \Big( \sum_{z \in G} \#\pi_z \, p_z \Big)^2 \le n^2 \sup_{z \in G} p_z^2 = O((c_N)^2) \qquad (29)$$



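by assumption. Taking (27) to (29) together now implies that

$$P_N^{(m)} = I + c_N Q_N^{(m)} + R_N^{(m)}, \qquad (30)$$

where the entries of $Q_N^{(m)} = (q^{(N,m)}_{\pi\tilde\pi})_{\pi,\tilde\pi \in \mathcal{P}_n^{\ell}}$
are given by $q^{(N,m)}_{\pi\tilde\pi} = \frac{1}{c_N} p^{(N,m)}_{\pi\tilde\pi}$ if
$\tilde\pi \in A^{\pi}_{xy}$ for some $x \ne y$, by
$q^{(N,m)}_{\pi\pi} = -\sum_{x \ne y} \sum_{\tilde\pi \in A^{\pi}_{xy}} \frac{1}{c_N} p^{(N,m)}_{\pi\tilde\pi}$,
and are zero otherwise, and $R_N^{(m)}$ is the matrix containing the rest terms. We have due to
(28) and (29) that $\|R_N^{(m)}\| = O((c_N)^2)$. Before turning to transition probabilities due to
reproduction let us observe that

$$\lim_{N \to \infty} q^{(N,m)}_{\pi\tilde\pi} = q^{(m)}_{\pi\tilde\pi}, \qquad (31)$$

where $q^{(m)}_{\pi\tilde\pi} = p(x, y)$ if $\tilde\pi \in A^{\pi}_{xy}$ and is zero for all other
$\tilde\pi \ne \pi$. This convergence follows directly from (27) since the term multiplying
$p_{xy}$ in (27) is a finite product whose individual factors converge to 1 due to the fact that
$N_z \to \infty$ and $p_z = O(c_N) \to 0$ by assumption for each $z \in G$. From (31) it then
follows that $q^{(N,m)}_{\pi\pi} \to -\sum_{\tilde\pi \ne \pi} q^{(m)}_{\pi\tilde\pi} =: q^{(m)}_{\pi\pi}$,
so that $Q^{(m)} = (q^{(m)}_{\pi\tilde\pi})_{\pi,\tilde\pi \in \mathcal{P}_n^{\ell}}$ is the matrix
of the transition rates for the migration in the limit.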



Recall now that $P_N^{(r)} = (p^{(N,r)}_{\pi\tilde\pi})_{\pi,\tilde\pi \in \mathcal{P}_n^{\ell}}$
describes the transition probabilities due to coalescence in the sample as a consequence of the
Cannings reproduction forward in time. If $\tilde\pi$ arises from $\pi$ by merging
$2 \le k = \#\pi - \#\tilde\pi + 1$ blocks with the same label x then (28) in Möhle and Sagitov
(2001) states that

Ms) 



lim — 

JV->oo c N 



JN,r) 



C 

lim 

iV-*oo c 



N 



N 



N 1 nn 



(32) 



with $\lambda^x_{b,k}$ defined as in (1) with $\Lambda^x$ instead of $\Lambda$. For other
$\tilde\pi \ne \pi$ we have that
$q^{(r)}_{\pi\tilde\pi} := \lim_{N \to \infty} \frac{1}{c_N} p^{(N,r)}_{\pi\tilde\pi} = 0$. Setting
$q^{(r)}_{\pi\pi} = -\sum_{\tilde\pi \ne \pi} q^{(r)}_{\pi\tilde\pi}$ we see that relation (32)
implies the analogous statement to (30) for coalescence,

$$P_N^{(r)} = I + c_N Q^{(r)} + R_N^{(r)}, \qquad (33)$$

where $R_N^{(r)} = (r^{(N,r)}_{\pi\tilde\pi})_{\pi,\tilde\pi \in \mathcal{P}_n^{\ell}}$ is the
matrix of the rest terms. Since there are only finitely many (at most $2^n$) non-zero entries in
each row we obtain $\sum_{\tilde\pi \in \mathcal{P}_n^{\ell}} |r^{(N,r)}_{\pi\tilde\pi}| = o(c_N)$.
Taken together this leads to $\|R_N^{(r)}\| = o(c_N)$. This implies that

$$P_N = I + c_N (Q_N + R_N), \qquad (34)$$

where

$$Q_N = Q_N^{(m)} + Q^{(r)}, \qquad (35)$$

with $\|R_N\| \to 0$ as $N \to \infty$ since $\|R_N^{(m)}\| = O((c_N)^2)$ and
$\|R_N^{(r)}\| = o(c_N)$. This and the convergence of $Q_N^{(m)}$ to $Q^{(m)}$ from (31) now
completes the proof of (23) and so of convergence of finite dimensional distributions.



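In order to complete the proof of Theorem 3.1, which states convergence in the Skorohod space
$D(\mathbb{R}_+, \mathcal{P}_n^{\ell})$, it remains to show relative compactness in
$D(\mathbb{R}_+, \mathcal{P}_n^{\ell})$. According to Corollary 3.7.4 in Ethier and Kurtz (1986)
we need to show the following two conditions:

(i) For every $\varepsilon > 0$ and $t \ge 0$ there exists a compact set
$\Gamma_{\varepsilon,t} \subset \mathcal{P}_n^{\ell}$ such that

$$\liminf_{N \to \infty} P(\Pi^{\ell,N}_t \in \Gamma_{\varepsilon,t}) \ge 1 - \varepsilon.$$

(ii) For every $\varepsilon > 0$ and $T > 0$ there exists a $\delta > 0$ such that

$$\limsup_{N \to \infty} P(w(\Pi^{\ell,N}, \delta, T) \ge \varepsilon) \le \varepsilon, \qquad (36)$$

where

$$w(\Pi^{\ell,N}, \delta, T) = \inf_{\{t_i\}} \max_i \sup_{s,t \in [t_{i-1}, t_i)} d_n(\Pi^{\ell,N}_s, \Pi^{\ell,N}_t), \qquad (37)$$

and $\{t_i\}$ ranges over all partitions of the form $0 = t_0 < t_1 < \cdots < t_{k-1} < T \le t_k$
with $\min_{1 \le i \le k} (t_i - t_{i-1}) > \delta$ and $k \ge 1$.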

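Since $\mathcal{P}_n^{\ell}$ is finite, it is compact and condition (i) is trivially fulfilled. To
prove condition (ii) we set $C := \|Q\|$, which is finite by assumption. Let $a_N = c_N (C + 1)$
and let in the following N be large enough so that $a_N < 1$ and $\|Q_N + R_N\| < C + 1$. Now
consider the discrete time Markov chain $(E^N_k, e^N_k)_{k \in \mathbb{N}}$ which does not change
its state with probability $1 - a_N$ and changes only $e^N_k$ to $e^N_{k+1} = e^N_k + 1$ with
probability $a_N - \sum_{\tilde\pi \ne \pi} p^N_{\pi\tilde\pi}$ if $E^N_{k+1} = E^N_k = \pi$.
Finally, we have $e^N_{k+1} = e^N_k + 1$ and $E^N_{k+1} = \tilde\pi$ given that $E^N_k = \pi$ with
probability $p^N_{\pi\tilde\pi}$. This means that the process $E^N$ is just a version of
$\Pi^{\ell,N}$ if the time between the steps is chosen to be $c_N$. We have that the times between
jumps of the Markov chain $(E^N, e^N)$ are given by i.i.d. geometric random variables $\tau^N_i$
with mean $1/a_N$. Since $w(E^N, \delta, T) = 0$ whenever there exists a J such that
$\tau^N_i > \delta / c_N$ for $i = 1, \dots, J$ and $\sum_{i=1}^{J} \tau^N_i > T / c_N$, it
suffices to show that we can find J and $\delta$ such that

$$\liminf_{N \to \infty} P\Big( \tau^N_i > \frac{\delta}{c_N} \text{ for } i = 1, \dots, J \text{ and } \sum_{i=1}^{J} \tau^N_i > \frac{T}{c_N} \Big) \ge 1 - \varepsilon.$$

This can be achieved since

$$P\Big( \tau^N_i > \frac{\delta}{c_N} \text{ for } i = 1, \dots, J \text{ and } \sum_{i=1}^{J} \tau^N_i > \frac{T}{c_N} \Big)$$
$$= P\Big( \sum_{i=1}^{J} \tau^N_i > \frac{T}{c_N} \,\Big|\, \tau^N_i > \frac{\delta}{c_N} \text{ for } i = 1, \dots, J \Big) \cdot P\Big( \tau^N_i > \frac{\delta}{c_N} \text{ for } i = 1, \dots, J \Big)$$
$$\ge P\Big( \sum_{i=1}^{J} \tau^N_i > \frac{T}{c_N} \Big) \cdot P\Big( \tau^N_1 > \frac{\delta}{c_N} \Big)^J \to P(Z < J) \cdot P(X > \delta)^J,$$

where Z is a Poisson random variable with mean $T(C + 1)$ and X is exponential with mean
$\frac{1}{C+1}$. This finishes the proof. □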
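As a hedged numerical illustration of this last step, the following sketch evaluates a limiting lower bound of the form $P(Z < J) \cdot P(X > \delta)^J$, with Z Poisson with mean T(C + 1) and X exponential with rate C + 1, and shows that it can be pushed close to 1 by first choosing J large and then δ small. The function names and the concrete values of T and C are assumptions for illustration.

```python
import math


def poisson_cdf(j, mean):
    """P(Z <= j) for a Poisson random variable Z with the given mean."""
    term = math.exp(-mean)
    total = 0.0
    for k in range(j + 1):
        total += term
        term *= mean / (k + 1)
    return total


def limiting_lower_bound(T, C, J, delta):
    """P(Z < J) * P(X > delta)^J with Z ~ Poisson(T(C+1)) and X ~ Exp(rate C+1)."""
    rate = C + 1.0
    return poisson_cdf(J - 1, T * rate) * math.exp(-rate * delta) ** J


# Example with assumed values T = 1 and C = 2: first J large, then delta small.
bound = limiting_lower_bound(T=1.0, C=2.0, J=20, delta=1e-4)
```

The order of the two choices matters: J controls the Poisson factor alone, while the exponential factor depends on the product J · δ.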

6.2. Proof of Theorem 4.1

In the following let $\Pi^{\ell,L}$ be a spatial Λ-n-coalescent on $T^L$ with transition kernel
$p^L$. We will write $\mathbb{P}^{\pi^{\ell,L}}$ to indicate that $\Pi^{\ell,L}$ is started in
$\pi^{\ell,L} \in \mathcal{P}_n^{\ell,L}$ at time 0. We want to show that asymptotically
$\Pi^{\ell,L}$ behaves like a coalescing random walk for which only ever two blocks meet
and coalesce instantaneously, and that the meetings only take place after the blocks have been
"randomized" in space such that any two are equally likely to meet at any such event. In order to
prove this we define another process of coalescing random walks $\hat\Pi^{\ell,L}$ coupled to
$\Pi^{\ell,L}$ (defined on the same probability space) and show that asymptotically
$\Pi^{\ell,L}$ behaves like $\hat\Pi^{\ell,L}$ and that $\hat\Pi^{\ell,L}$ has the desired
properties. Thus, the evolution of $\Pi^{L}$ (unlabeled) is given by a non-spatial Kingman
coalescent as $L \to \infty$.

We first couple the spatial Λ-n-coalescent with a coalescing random walk. Let $L \in \mathbb{N}$
and let $(\xi^L(i))_{i \in [n]}$ be an i.i.d. family of random walks on $T^L$ with transition
kernel $p^L$. We now define a version of the spatial Λ-n-coalescent by stipulating that the label
of a block B follows the random walk $\xi^L(\min(B))$.

We now require some notation. Let $i, j \in [n]$ and $t \in \mathbb{R}_+$. Let $A_t^L(i)$ denote
the unique block of $\Pi_t^{\ell,L}$ with $i \in A_t^L(i)$ and let
$M_t^L(i) = \xi_t^L(\min(A_t^L(i)))$ be the location (label) of this block. We set

$$\tau^L(i,j) := \inf\{t \ge 0 \mid M_t^L(i) = M_t^L(j)\}, \qquad (38)$$
$$\tau_c^L(i,j) := \inf\{t \ge 0 \mid A_t^L(i) = A_t^L(j)\}. \qquad (39)$$

In words, these are the first times at which the two blocks containing i and containing j meet and
coalesce, respectively. We also define the jump times

$$0 = \tau_0^L < \tau_1^L < \cdots \quad \text{and} \quad 0 = \tau_{c,0}^L < \tau_{c,1}^L < \cdots \qquad (40)$$

where

$$\tau_k^L := \min_{i,j \in [n]} \{\tau^L(i,j) \mid \tau^L(i,j) > \tau_{k-1}^L\}$$

is the first meeting time after $\tau_{k-1}^L$ of blocks that have not met before and
$\tau_{c,k}^L$, defined analogously, is the first time after $\tau_{c,k-1}^L$ at which blocks
coalesce.

Assuming that no two blocks of the starting configuration have the same label, we define the
coalescing random walk $\hat\Pi^{\ell,L}$ by first setting
$\hat\Pi_0^{\ell,L} = \Pi_0^{\ell,L}$. The label of a block B in $\hat\Pi^{\ell,L}$ follows the
random walk $\xi^L(\min(B))$ and as soon as blocks have the same label they coalesce
instantaneously. We set $\hat M^L$ and $\hat A^L$ as before, now for the process
$\hat\Pi^{\ell,L}$. We define the quantities in (38) to (40) analogously for the process
$\hat\Pi^{\ell,L}$ but note that $\hat\tau^L(i,j) = \hat\tau_c^L(i,j)$ and
$\hat\tau_k^L = \hat\tau_{c,k}^L$.

Furthermore note that $\hat\tau_k^L = \tau_k^L$ for k = 1 but that the analogous statement is not
true for k > 1 since blocks meeting in the spatial Λ-n-coalescent could part without coalescing.
We will see later though that $\hat\tau_k^L = \tau_k^L$ becomes very likely for large L if the
blocks are initially sampled far apart.
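The instantaneously coalescing random walk used in this coupling can be sketched as follows. This is a simplified illustration under stated assumptions: every block jumps at rate 1 to a uniformly chosen nearest neighbour on a torus of side length `side` (the kernel $p^L$ of the text is more general), and the code only records meeting times rather than tracking which walk the merged block follows, which does not affect the law of the meeting times because the walks are i.i.d.

```python
import random


def coalescing_walk_meeting_times(starts, side, seed=0):
    """Meeting (= coalescence) times of instantaneously coalescing nearest-neighbour
    random walks on a side x side torus. `starts` must contain distinct sites."""
    if len(set(starts)) != len(starts):
        raise ValueError("blocks must start at pairwise distinct sites")
    rng = random.Random(seed)
    positions = list(starts)              # one walker per block
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    time = 0.0
    meeting_times = []
    while len(positions) > 1:
        # with b blocks jumping at rate 1 each, the next jump occurs at rate b
        time += rng.expovariate(len(positions))
        w = rng.randrange(len(positions))
        dx, dy = rng.choice(moves)
        x, y = positions[w]
        new_site = ((x + dx) % side, (y + dy) % side)
        if new_site in positions:
            positions.pop(w)              # blocks at the same site merge at once
            meeting_times.append(time)
        else:
            positions[w] = new_site
    return meeting_times


# Example on a deliberately small torus so that the run finishes quickly.
times = coalescing_walk_meeting_times([(0, 0), (2, 2), (4, 1), (1, 3)], side=5, seed=1)
```

Starting from b blocks at distinct sites, the sketch always produces exactly b - 1 meeting times, mirroring the fact that only pairwise mergers occur in the coalescing random walk.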

We will now prove that the spatial Λ-n-coalescent behaves after rescaling like the coupled
coalescing random walk. We define for any random walk $Z^L$ on $T^L$ with transition kernel $p^L$
the stopping times

$$H^L := \inf\{t \ge 0 \mid Z_t^L = 0\}, \qquad (41)$$

which is the first hitting time of the origin, and

$$W^L := \inf\{t > 0 \mid Z_t^L = Z_0^L = 0 \text{ and there is an } s \in [0, t] \text{ with } Z_s^L \ne 0\}, \qquad (42)$$

the first return time to the origin. We also use these definitions for $L = \infty$ with the
convention that $T^\infty = \mathbb{Z}^2$.



Lemma 6.1. Let $i, j \in [n]$. Then a.s.

$$\tau_c^L(i,j) - \tau^L(i,j) = T_{coal} + \sum_{l=1}^{A-1} W_l^L, \qquad (43)$$

where $W_1^L, W_2^L, \dots$ are i.i.d. copies of $W^L$ of a random walk $Z^L$ on $T^L$ with
transition kernel $2p^L$. The random variables $T_{coal}$ and A are independent from
$\{W_l^L\}_{l \in \mathbb{N}}$ and from each other, and $T_{coal} \sim \exp(\lambda_{2,2})$ as
well as $A \sim \mathrm{geom}(\lambda_{2,2}(2 + \lambda_{2,2})^{-1})$.

Proof. It is clear that $(M_t^L(j) - M_t^L(i))_{t \in \mathbb{R}_+}$, the distance between the
blocks containing i and j, is a rate 2 random walk with transition kernel $p^L$ until the
coalescence time $\tau_c^L(i,j)$. Using the strong Markov property we restart this random walk at
time $\tau^L(i,j)$, at which the blocks containing i and j meet for the first time. Therefore we
restart in 0. The time until one block migrates away from the other block is exponentially
distributed with parameter 2, the time until they coalesce while in the same site is exponentially
distributed with parameter $\lambda_{2,2}$. Thus, the probability that the two coalesce before
they part is given by $\lambda_{2,2}(2 + \lambda_{2,2})^{-1}$ and this then happens after a time
$T_{coal}$. If they part it will take time $W_l^L$ to return to the same site. Using the strong
Markov property we can repeat the argument, leading to A - 1 independent return times $W_l^L$ with
A as above, and thus to the statement of the lemma. □
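The competition between parting (rate 2) and coalescing (rate $\lambda_{2,2}$) in this argument can be checked with a short sketch. It computes the expected total time the two blocks spend at the same site before coalescing, both analytically (expected number of joint visits times the mean length of a visit) and by simulation; the return times to the same site are deliberately left out. Function names are assumptions for illustration.

```python
import random


def expected_time_together(lam22, migration_rate=2.0):
    """Expected total time at a common site before coalescence:
    (expected number of visits) * (mean duration of one visit)."""
    p_coalesce = lam22 / (migration_rate + lam22)   # coalesce before parting
    expected_visits = 1.0 / p_coalesce              # mean of the geometric number of visits
    mean_visit = 1.0 / (migration_rate + lam22)     # each visit ends at rate 2 + lambda_{2,2}
    return expected_visits * mean_visit


def simulate_time_together(lam22, runs, seed=0, migration_rate=2.0):
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        while True:
            total += rng.expovariate(migration_rate + lam22)
            if rng.random() < lam22 / (migration_rate + lam22):
                break                                # the pair coalesced during this visit
    return total / runs
```

The analytic value simplifies to $1/\lambda_{2,2}$, the mean of an exponential random variable with parameter $\lambda_{2,2}$, in agreement with the distribution of $T_{coal}$ stated in the lemma.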



As an immediate corollary we obtain that on the time scale that we are considering (recall
the definition of $s_L$ in (20)) the time from the first encounter of two blocks until their
eventual coalescence is asymptotically short.

Corollary 6.2. Let $i, j \in [n]$. For all $\varepsilon > 0$ and $\delta > 0$ there is an
$L(\varepsilon, \delta) \in \mathbb{N}$ such that for all $L \ge L(\varepsilon, \delta)$ and for
all $\pi^{\ell,L} \in \mathcal{P}_n^{\ell,L}$ we have

$$\mathbb{P}^{\pi^{\ell,L}}\big( s_L^{-1} |\tau_c^L(i,j) - \tau^L(i,j)| > \varepsilon \big) < \delta.$$
Proof. Let $L \in \mathbb{N}$ and $\pi^{\ell,L} \in \mathcal{P}_n^{\ell,L}$. If
$A_0^L(i) = A_0^L(j)$ then $\tau_c^L(i,j) = 0 = \tau^L(i,j)$ and the claim follows. Now let
$A_0^L(i) \ne A_0^L(j)$ and let $W_1^\infty, W_2^\infty, \dots$ be as in Lemma 6.1 for
$L = \infty$. If we consider the random walk on $T^L$ as the projection of a random walk on
$\mathbb{Z}^2$ that defines the $W_l^\infty$ then it follows that $W_l^L \le W_l^\infty$. With
Lemma 6.1 we have for all $\pi^{\ell,L} \in \mathcal{P}_n^{\ell,L}$ that

$$\mathbb{P}^{\pi^{\ell,L}}\big( s_L^{-1} |\tau_c^L(i,j) - \tau^L(i,j)| > \varepsilon \big) = \mathbb{P}^{\pi^{\ell,L}}\Big( T_{coal} + \sum_{l=1}^{A-1} W_l^L > s_L \varepsilon \Big) \le \mathbb{P}\Big( T_{coal} + \sum_{l=1}^{A-1} W_l^\infty > s_L \varepsilon \Big).$$

Since the random walk on $\mathbb{Z}^2$ is recurrent we have $W^\infty < \infty$ almost surely.
Likewise $T_{coal}$ and A are almost surely finite. Since all these random variables do not depend
on L and $\pi^{\ell,L}$ the claim follows now with $s_L \to \infty$ as $L \to \infty$. □



Let $\pi^{\ell,L} \in \mathcal{P}_n^{\ell,L}$ and let every two blocks in $\pi^{\ell,L}$ have
different labels. It is clear that in the coalescing random walk $\hat\Pi^{\ell,L}$ starting in
$\pi^{\ell,L}$ almost surely only pairwise mergers happen. So for $k \in [\#\pi^{\ell,L} - 1]$ we
have that $\hat\tau_k^L < \infty$ is a time at which two blocks in $\hat\Pi^{\ell,L}$ merge. By
swapping the random walks belonging to the two blocks merged at time $\hat\tau_k^L$ after time
$\hat\tau_k^L$ we can define a new coalescing random walk which agrees with $\hat\Pi^{\ell,L}$
until time $\hat\tau_k^L$ and which has the same distribution as $\hat\Pi^{\ell,L}$. More
precisely, this means that if $B_1$ and $B_2$ are the two blocks that meet at time $\hat\tau_k^L$
with $\min B_1 < \min B_2$ then the label of the newly created block follows the motion
$\xi^L(\min B_2)$ rather than $\xi^L(\min B_1)$. Let $\hat\tau_{k+1}^{L,(2)}$ be the first meeting
time of blocks in this process after time $\hat\tau_k^L$. For q > 0 we consider the event

$$B^L := B^L(q, \pi^{\ell,L}) := \big\{ \hat\tau_{k+1}^L - \hat\tau_k^L > s_L q, \; \hat\tau_{k+1}^{L,(2)} - \hat\tau_k^L > s_L q \text{ for all } k \in [\#\pi^{\ell,L} - 2] \big\} \qquad (44)$$

on which (after rescaling) it takes some time until a block that is created in a coalescence event
meets further blocks, whether it follows the motion of the first or the second block that took
part in the coalescence event. Note that for $\#\pi^{\ell,L} = 2$ we have
$B^L(q, \pi^{\ell,L}) = \Omega$ since no conditions need to be met.

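Proposition 6.3. Let q > 0. For all $\varepsilon > 0$, $\delta > 0$ and $\nu > 0$ there are
constants $L(\varepsilon, \delta, \nu) \in \mathbb{N}$ so that for all
$L \ge L(\varepsilon, \delta, \nu)$ and all $\pi^{\ell,L} \in \mathcal{P}_n^{\ell,L}$ with
$\#\pi^{\ell,L} \ge 2$, such that no two blocks of $\pi^{\ell,L}$ carry the same label, we have

$$\mathbb{P}^{\pi^{\ell,L}}\Big( s_L^{-1} \big\| (\tau_{c,1}^L, \dots, \tau_{c,\#\pi^{\ell,L}-1}^L) - (\hat\tau_1^L, \dots, \hat\tau_{\#\pi^{\ell,L}-1}^L) \big\| \ge \varepsilon, \; B^L(q, \pi^{\ell,L}) \Big) < \delta \qquad (45)$$

as well as

$$\mathbb{P}^{\pi^{\ell,L}}\Big( \Pi^{L}_{\tau_{c,k}^L} \ne \hat\Pi^{L}_{\hat\tau_k^L} \text{ for some } k = 1, \dots, \#\pi^{\ell,L} - 1, \; B^L(q, \pi^{\ell,L}) \Big) < \nu. \qquad (46)$$

Note that the last statement (46) is about the unlabeled partition structures of the two
processes. For proving this statement we set for $\Pi^{\ell,L}$ started in $\pi^{\ell,L}$

$$C^L(0) := \{ (i,j) \in [n]^2 \mid A_0^L(i) \ne A_0^L(j) \}, \qquad (47)$$

the set of index pairs that are initially in different blocks.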
Proof. For any q > 0 there are almost surely exactly $\#\pi^{\ell,L} - 1$ coalescence events of
$\hat\Pi^{\ell,L}$ started in $\pi^{\ell,L}$ since at any coalescence event exactly two blocks
merge. We will first show the following statements by induction in
$k \in [\#\pi^{\ell,L} - 1]$:
There is an $L(\varepsilon, \delta, \nu) \in \mathbb{N}$ such that for all $\pi^{\ell,L}$ and
$L \ge L(\varepsilon, \delta, \nu)$,



f L k >s L E,B L (q,n e - L ))<6, 



(48) 
(49) 
(50) 
(51) 
(52) 



r' L (^ k <^,B L (q,n e - L ))<6, 
V' L (T^>T L k+1 ,B L (q,n C ' L ))<v, 

V''\U e f + n C 't,B L (q,n t - L )) < v. 

These statements will then complete the proof since (|45| ) follows immediately from ( j48j ) and ( |49] l 
for all k e [#n e - L - 1], Likewise, @ follows from @ |50), and ((52} for all k e [#t7^ - 1]. 

In order to start the induction let first k = 1. We begin by showing ( |48] l. Due to Corollary |6.2| 
we can find L(e, 5) e N with 

F^ft ;) - r L {i, j) > s L s for all (i, j) e [«] 2 ) < ^ 

for all L > L(s,S) and all jr f ' L e !P, f , x with #n e - L > 2. Let (/,;') e C L (0) so that r^(/,;') > t l cV 
Thus, on the event A,-j := {t^ = t l (/, _/)} , it follows that 



(<, -rt > Sifi.Ay, B (q,n • )) = F" (t^-t^,;) > s^s, Ay, B {q, « )) 

< P^(T^(j,j) - r L (/,j) > SL e)< 4- 



Hence, with P* (Ay for one (i, ;') e C L (0)) = 1 we have 

F^«i - > B L (q, n e ' L )) < " T t ^ *U*> A U> ^)) < S 



for all L > L(e, 6) and all tt C ' l e V„ with #n e ' L > 2, which is (48 1 since t\ = t\. 

Again because of = statement (49 1 is trivial for k — 1 since r^j > Tj by definition. For 
showing ( 50 1 assume that #n C ' L > 3 as the statement is immediate for #n e ' L = 2. Observe that we 
have =~f| or = f^ 2 \ In both cases the arguments are completely analogous so it suffices 
to show the result for f| instead of r£, and hence to consider the event rf*, > t\. 

We have t\ — t\ < f \ < oo almost surely. From ( 48 1 for k = 1 and s — g we obtain that there 
is an L(v) = L(g, v) e N so that for all L > L(v) and n e - L e 9^ with #t/' l > 3, 

P" (t* > f£, fi% ^)) = « x - > f£ - ff > s L q, B\q, ««■)) 

<¥"' X (T L c X -f L x >s L q,B L {q,n t - L ))<v, 



which proves ( 50 1. 



For showing ( 5 1 1 and ( 52 1 assume again that #n e,L > 3 as the statement is trivially true for 

folli 
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#7r f,L = 2. We observe that from t l , < t~ it follows that there is a coalescence event for Tl c,L 

CA 2 



tl e f and t\ 



before any additional blocks meet. Thus, exactly the two blocks that met at time will have 
coalesced at time r L , before meeting other blocks implying TL e f 
and d52l follow from d50 



Thus, (51 



Let the claim now be true for k e [#7r f,L - 2] and let us consider k + 1 e [#n e - L - 1]. We can 



assume T L ck < f L M 



T7" , and IT 

k+1 r 



n 1 



l.L 



since these events have by the induction assumption asymptotically probability 1. Using the
strong Markov property of $\Pi^{\ell,L}$ and $\hat\Pi^{\ell,L}$ we can restart both processes at
time $\tau_{c,k}^L$ (note that for k + 1 all statements only concern the processes at times
greater than $\tau_{c,k}^L$). Since on the above events both processes start in the same partition
(for which different blocks have different labels) the proof for k + 1 now works analogously to
the case k = 1. The randomness of the new starting point poses no problem since we have uniform
results (or alternatively because the space $\mathcal{P}_n^{\ell,L}$ is finite). □

In order to study the asymptotic behavior of $\hat\Pi^{\ell,L}$ and also of $\Pi^{\ell,L}$ we need
some results on coalescing random walks that were already shown in Cox (1989).

Proposition 6.4. Let $Z^L$ be a simple symmetric random walk on $T^L$. Let
$(a_L)_{L \in \mathbb{N}}$ be a sequence of nonnegative numbers with

$$\lim_{L \to \infty} L^{-1} \sqrt{\log L} \; a_L = \infty, \qquad (a_L)_{L \in \mathbb{N}} = o(L).$$

Let $H^L$ be the first hitting time of the origin as in (41). Then



lim sup 



Ml>oi.fi>0 



0. 



This is the special case d = 2 of Theorem 4 in Cox (1989).

Corollary 6.5. Let $i \ne j \in [n]$. Then we have



lim 

L— >oo 



sup 

,rxj,t>0 



>t\ — exp {-Tit) 



= 0. 



Proof. We know that $M_t^L(i) - M_t^L(j)$ (here the subtraction is done with respect to the cyclic
structure of the torus) is up to time $\tau^L(i,j)$ a simple symmetric random walk on $T^L$ with
jump rate 2 and transition probabilities given by $p^L$ that is started in
$z := M_0^L(i) - M_0^L(j) \in T^L$ with $\|z\| \ge a_L$ due to
$\pi^{\ell,L} \in [[a_L, \sqrt{2} L]]$. The time $\tau^L(i,j)$ is just the hitting time of the
origin of this random walk and so the claim follows from Proposition 6.4. □



A more general statement holds for an arbitrary number of blocks that perform a coalescing 
random walk. 

Proposition 6.6. Let $\hat\tau_1^L = \tau_1^L$ be the first meeting time of any blocks. Then for any $n \in \mathbb{N}$,



lim 

L— >oo 



sup 

TT^Wat, ^2L]]rfP e „' L ,t>0 



^(r[>s L t)- SW [J # f)t 



= 0. 



For $\#\pi^{\ell,L} = 2$ this is the statement of Corollary 6.5; for $\#\pi^{\ell,L} > 2$ see (3.2) in Cox (1989).
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Lemma 6.7. Let n e N and let Kh, c f^ 1 for C c [«] be those labeled partitions n e ' L ,for which 
the blocks (Ag(0}jec flre pairwise different. Let T > one/ i,j,k,l e [n] fee pairwise disjoint. 
Then we have the following statements: 

(i) For I L nLL := jf' 1 P* a (ff = f L (i, j) e du, r L (Mf, (/), #£(*)) < a L ) we have 

lim sup I^ rL = 0. 

L ^ 00 ^ t E[[ flz .,V2L]]n^ jt| 

fii; For := g SL P^ fX (f[ = f L (/, j) e du, r L (M^(k), M£(/)) < a L ) w We 

lim sup J L [L = 0, 



Proof. Since the structure of the blocks is irrelevant for the statement it suffices to consider
$\hat\Pi_0^{L} = \pi^{L} = \{ \{i\} \mid i \in [n] \}$. The two statements for the coalescing
random walk $\hat\Pi^{\ell,L}$ are now (3.7) and (3.8) in Cox (1989). □

The next result states that the probability for the event $B^L(q, \pi^{\ell,L})$ of (44) tends to
1 as L is large and q is small, provided the blocks are sampled far enough apart.

Proposition 6.8. For any $\varepsilon > 0$ there is a $q(\varepsilon) > 0$ and
$L(\varepsilon) \in \mathbb{N}$ such that for all $L \ge L(\varepsilon)$ and all
$\pi^{\ell,L} \in [[a_L, \sqrt{2} L]]$ we have

$$\mathbb{P}^{\pi^{\ell,L}}\big( B^L(q(\varepsilon), \pi^{\ell,L}) \big) \ge 1 - \varepsilon.$$

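Proof. We will show the following statements by induction over $k \in [\#\pi^{\ell,L} - 1]$ for
any $\varepsilon > 0$. There is a $q(\varepsilon) > 0$ and $L(\varepsilon) \in \mathbb{N}$ such
that for all $L \ge L(\varepsilon)$ and all $\pi^{\ell,L} \in [[a_L, \sqrt{2} L]]$ with
$\#\pi^{\ell,L} > k$ we have

$$\mathbb{P}^{\pi^{\ell,L}}\big( \hat\Pi^{\ell,L}_{\hat\tau_k^L} \in [[a_L, \sqrt{2} L]] \big) \ge 1 - \varepsilon, \qquad (53)$$

$$\mathbb{P}^{\pi^{\ell,L}}\big( \hat\tau_k^L - \hat\tau_{k-1}^L > s_L q(\varepsilon) \big) \ge 1 - \varepsilon. \qquad (54)$$

In words, these events say that at the time $\hat\tau_k^L$ the blocks are again far apart with
high probability and the probability of meeting times being very close together is small. The
proposition follows then from (54) for all $k \in [\#\pi^{\ell,L} - 1]$ since
$\hat\tau_k^L - \hat\tau_{k-1}^L$ and $\hat\tau_k^{L,(2)} - \hat\tau_{k-1}^L$ have the same
distribution.

Note that since in $\hat\Pi^{\ell,L}$ almost surely only pairwise mergers occur, all of the
$\hat\tau_k^L$ above are actual jump times of the process.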

We start with k = 1 and first show p3j ). From Proposition |6.6| it follows that there is an 
L(e) e N and a T(e) > such that 



^(ff > s L T(e)) < 



for all L > L(e) and all n t,L e [[«£, V2L]] with #77 /,L > 2. According to Lemma 6.7 by choosing 
Lie) larger if necessary, 

V n ' L {f\ e [0,s L T(s)]M L * [[a L , V2L]]) < ? 

T i z 
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for all L > L(s) and all n { ' L e [[a L , V2L]] with #n C ' L > 2. Thus, for those L and n 1 ' 1 
¥* c \flf i [[a L , V2L]]) 

< e [0, s L r( e )],m L « [[a L , V5L]]) + F"(rf > s L r(e)) < e. 



In order to show (54i we note that by Corollary 6.5 we can choose q(s) > small enough 
and make L(e) e N larger if necessary such that for all L > L(e) and all n c ' L e [[a L , V2L]] with 

#7T e - L > 2 

\r< eL ( S - L 1 T L (i,j)<q(sj)\ < 4- (55) 



With C L (0) as in (|47j it follows that 

F^ff - f£ < s i9 (e)) < J] P*"^ 1 ^ = sl^ft 7) < 9(e)) 

(/jleC^O) 

< J] F tf (*Z 1 T t (i,/)^9(e))< X J<«- 

(JJ)eCHO) (U)eC L (0) " 



Let the claim now be true for $k \in [n - 2]$. Since (53) holds for k we can assume

$$\hat\Pi^{\ell,L}_{\hat\tau_k^L} \in [[a_L, \sqrt{2} L]].$$

Using the strong Markov property of $\hat\Pi^{\ell,L}$ we can restart the process at time
$\hat\tau_k^L$. Since the blocks are again well separated the induction step now follows
analogously to the case k = 1. Note that the random starting point does not pose a problem since
we have uniform results (or alternatively because $\mathcal{P}_n^{\ell,L}$ is finite). □

We will need the following property of a random walk on the torus, which states that
asymptotically the random walk will be uniformly distributed.

Proposition 6.9. Let $Z^L$ be a simple symmetric random walk on $T^L$ with transition kernel
$p^L$. Let $(t_L)_{L \in \mathbb{N}}$ be a sequence with $\lim_{L \to \infty} t_L = \infty$. Then

$$\lim_{L \to \infty} \sup_{t \ge t_L (2L+1)^2} \; \sup_{x \in T^L} (2L+1)^2 \left| p_t^L(x, 0) - (2L+1)^{-2} \right| = 0.$$

This is (2.8) in Cox (1989). From Proposition 6.9 we obtain the asymptotic exchangeability
of the blocks of the coalescing random walk.

Lemma 6.10. Let $\pi_0 \in \mathcal{P}_n$ be a partition with $\#\pi_0 \ge 2$ and for
$L \in \mathbb{N}$ large enough let $\pi^{\ell,L} \in [[a_L, \sqrt{2} L]]$ be a labeled partition
whose partition structure equals $\pi_0$, meaning that $\pi^{L} = \pi_0$. Furthermore, let
$\sigma$ be a permutation of $[\#\pi_0]$ and $\pi^{\sigma,\ell,L}$ the labeled partition that is
obtained from $\pi^{\ell,L}$ by permuting the labels of the blocks with $\sigma$. We define for
$L \in \mathbb{N}$

$$q_L := (\log \log L)(2L+1)^2.$$

Let $k \in [\#\pi_0 - 1]$, then there exists a sequence $(\delta_L)_{L \in \mathbb{N}}$,
independent of $\pi^{\ell,L}$ and $\sigma$, with $\delta_L \to 0$ as $L \to \infty$ and

KkK))-^('(i%))\ <s * 



for any measurable and bounded $f : \mathcal{P}^{\ell,L}_n \to \mathbb{R}$. 

Proof. Let $m := \#\pi_0$ and $\pi^{\ell,L} = \{(B_1, \zeta_1), \ldots, (B_m, \zeta_m), \ldots\}$. We can couple the coalescing 
random walks started in $\pi^{\ell,L}$ and $\pi^{\sigma,\ell,L}$ in a natural way by using the same motions for the 
corresponding blocks that start with the same labels. Thus, all times $\hat\tau^L_k = \hat\tau^{\sigma,L}_k$ are identical for 
$k \in [m-1]$. Since $s_L^{-1} q_L \to 0$ as $L \to \infty$ we have by Proposition 6.6 that 

$$\mathbb{P}^{\pi^{\ell,L}}\big( q_L < \hat\tau^L_1 \big) = \mathbb{P}^{\pi^{\sigma,\ell,L}}\big( q_L < \hat\tau^L_1 \big) \xrightarrow[L \to \infty]{} 1 \qquad (56)$$

uniformly over all $\pi^{\ell,L}, \pi^{\sigma,\ell,L} \in [[a_L, \sqrt{2}L]]$. Thus, due to the boundedness of $f$ it suffices to 
show the claim on the event $\{q_L < \hat\tau^L_1 < \infty\}$. For any $z = (z_1, \ldots, z_m) \in (T^L)^m$ we set 
$\pi^z := ((B_1, z_1), \ldots, (B_m, z_m), (\emptyset, \partial), \ldots)$. While $t \le q_L < \hat\tau^L_1$ the blocks perform independent random 
walks with transition kernel $p^L$. Thus, we have due to (56) that 

$$\mathbb{E}^{\pi^{\ell,L}}\big( f(\hat\Pi^{\ell,L}_{q_L}) \big) = \sum_{z \in (T^L)^m} f(\pi^z)\, \mathbb{P}^{\pi^{\ell,L}}\big( \hat\Pi^{\ell,L}_{q_L} = \pi^z \big) + o(1) = \sum_{z \in (T^L)^m} f(\pi^z) \prod_{k=1}^{m} p^L_{q_L}(\zeta_k - z_k, 0) + o(1),$$

where the $o(1)$ term converges to 0 uniformly for $L \to \infty$. We also have the analogous statement 
for $\pi^{\sigma,\ell,L}$ with $\zeta_{\sigma(k)}$ instead of $\zeta_k$. Hence, 

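$$\left| \mathbb{E}^{\pi^{\ell,L}}\big( f(\hat\Pi^{\ell,L}_{q_L}) \big) - \mathbb{E}^{\pi^{\sigma,\ell,L}}\big( f(\hat\Pi^{\ell,L}_{q_L}) \big) \right|$$

$$\le \sum_{z \in (T^L)^m} |f(\pi^z)| \left| \prod_{k=1}^{m} p^L_{q_L}(\zeta_k - z_k, 0) - (2L+1)^{-2m} \right| + \sum_{z \in (T^L)^m} |f(\pi^z)| \left| \prod_{k=1}^{m} p^L_{q_L}(\zeta_{\sigma(k)} - z_k, 0) - (2L+1)^{-2m} \right| + o(1)$$

$$\le 2 \|f\|_\infty \sup_{z \in (T^L)^m} (2L+1)^{2m} \left| \prod_{k=1}^{m} p^L_{q_L}(z_k, 0) - (2L+1)^{-2m} \right| + o(1),$$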



where we have used in the last inequality that there are $(2L+1)^{2m}$ terms in the sums of the 
previous line. With Proposition 6.9 (set $t_L := \log\log L$) we have $(2L+1)^2 p^L_{q_L}(z, 0) \to 1$ uniformly 
in $z$, and so the claim now follows since the right hand side converges uniformly over all $\pi^{\ell,L}$ and 
$\sigma$. □ 



We now introduce some notation for the waiting time between meeting and coalescence times 
of blocks. Namely, for $\pi^{\ell,L} \in \mathcal{P}^{\ell,L}_n$ we define for $k \in [\#\pi^L - 1]$ the waiting times 

$$\sigma^L_k := \tau^L_k - \tau^L_{k-1}. \qquad (57)$$

(Formally, set $\infty - \infty = 0$.) Analogously we define the waiting times $\hat\sigma^L_k$ of $\hat\Pi^{\ell,L}$. For $\pi_0 \in \mathcal{P}_n$ let 
$(K^{\pi_0}_t)_{t \in \mathbb{R}_+}$ be the non-spatial Kingman coalescent started in $\pi_0$ and let $T_{K,k}$ be its $k$-th coalescence 
time for $k \in [\#\pi_0 - 1]$. We set $T_{K,0} := 0$ and define for $k \in [\#\pi_0 - 1]$ the waiting times 

$$U_k := T_{K,k} - T_{K,k-1}. \qquad (58)$$



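Note that the family of random variables $(U_k)_{k \in [\#\pi_0 - 1]}$ is independent, and that 

$$U_k \sim \mathrm{Exp}\left( \binom{\#\pi_0 - (k-1)}{2} \right).$$

Note also that they are independent of $(K^{\pi_0}_{T_{K,k}})_{k \in [\#\pi_0 - 1]}$ as well as that $K^{\pi_0}_{T_{K,k}}$ results from $K^{\pi_0}_{T_{K,k-1}}$ by 
coalescence of two blocks chosen at random. 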
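
The description of the waiting times and the jump chain of the Kingman coalescent can be turned into a short simulation. This is a sketch for illustration only; the function names `kingman_rates` and `sample_kingman` are hypothetical and do not appear in the text.

```python
import math
import random


def kingman_rates(m):
    """Rates binom(m - (k - 1), 2) of the waiting times U_1, ..., U_{m-1}
    for a Kingman coalescent started with m blocks."""
    return [math.comb(m - (k - 1), 2) for k in range(1, m)]


def sample_kingman(blocks, rng):
    """Sample one path of the Kingman coalescent started from `blocks`.

    Returns a list of pairs (U_k, partition after the k-th coalescence).
    The waiting times are independent exponentials, and each transition
    merges two blocks chosen uniformly at random."""
    current = [frozenset(b) for b in blocks]
    path = []
    while len(current) > 1:
        wait = rng.expovariate(math.comb(len(current), 2))
        i, j = rng.sample(range(len(current)), 2)
        merged = current[i] | current[j]
        current = [b for idx, b in enumerate(current) if idx not in (i, j)]
        current.append(merged)
        path.append((wait, frozenset(current)))
    return path


if __name__ == "__main__":
    rng = random.Random(1)
    print(kingman_rates(4))
    for wait, partition in sample_kingman([{1}, {2}, {3}, {4}], rng):
        print(round(wait, 3), sorted(sorted(b) for b in partition))
```

Averaging the first waiting time over many runs should give a value close to the reciprocal of the first rate, in line with the exponential law stated above.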

We are now able to prove a result about the asymptotic behavior of the coalescence times 
and the types of transitions for coalescing random walks. Note that the result refers only to the 
partition structure $\hat\Pi^L$ of the coalescing random walk and not to the labeled partitions $\hat\Pi^{\ell,L}$. 

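Theorem 6.11. Let $\pi_0 \in \mathcal{P}_n$ with $m := \#\pi_0 \ge 2$. We consider for every $L \in \mathbb{N}$ large enough a 
labeled partition $\pi^{\ell,L} \in [[a_L, \sqrt{2}L]]$ with partition structure $\pi^L = \pi_0$. We then obtain convergence 
in distribution for $L \to \infty$, 

$$\left( \frac{\hat\sigma^L_1}{s_L}, \hat\Pi^L_{\hat\tau^L_1}, \ldots, \frac{\hat\sigma^L_{m-1}}{s_L}, \hat\Pi^L_{\hat\tau^L_{m-1}} \right) \Rightarrow \left( \pi^{-1} U_1, K^{\pi_0}_{T_{K,1}}, \ldots, \pi^{-1} U_{m-1}, K^{\pi_0}_{T_{K,m-1}} \right).$$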



Proof. The proof will follow along the lines of the analogous result in Limic and Sturm (2006) 
for $(U_1, \ldots, U_{m-1})$ in dimension $d \ge 3$ and also use the results of Cox (1989) for coalescing 
random walks. We first show that there is a sequence $(\varepsilon_L)_{L \in \mathbb{N}}$ with $\varepsilon_L \to 0$ as $L \to \infty$ such that 
for all $k \in [m-1]$ and $\pi^{\ell,L} \in [[a_L, \sqrt{2}L]]$ as well as for all $u > 0$, 



f * (t[ > s L u) - exp(-7rj 



#n — k 



< el- 



(59) 



Note that since no random walk jumps happen at the same time and since coalescence is 
instantaneous, no more than two blocks may meet and coalesce, so that $\#\hat\Pi^L_{\hat\tau^L_k} = \#\pi_0 - k$ for all 
$k \in [m-1]$. For those $k$ we also set 

$$D^L_k := \left\{ \hat\Pi^{\ell,L}_{\hat\tau^L_k} \in [[a_L, \sqrt{2}L]] \right\}.$$

Then, for all $u > 0$ we obtain 



< E^ll 



fl "/-L \ / i#n-k\ 
T t [t 1 > i L «j-expl-7rl \u 

** (f j > SLuj - exp J -n\ 



#n - k 
2 



+ 2P"'' L {(D L k ) c ) . (60) 



Proposition 6.6 and Lebesgue's dominated convergence theorem imply that the first term on the 
right hand side converges to zero uniformly in $\pi^{\ell,L}$ and $u$. The second term on the right hand side 
of (60) converges to 0 uniformly in $\pi^{\ell,L}$ and $u$ due to Proposition 6.8 (see (53) in the proof). This 
shows (59). 



We will now show convergence in distribution. For $t_1, \ldots, t_{m-1} > 0$, $\pi_1, \ldots, \pi_{m-1} \in \mathcal{P}_n$ and 
$k \in [m-1]$ we define the event 

$$A^L_k := \left\{ \frac{\hat\sigma^L_i}{s_L} > t_i,\ \hat\Pi^L_{\hat\tau^L_i} = \pi_i \ \text{ for all } i \in [k] \right\}.$$

Also set $A^L_0 := \Omega$. Since only binary mergers are possible for coalescing random walks as well as 
the Kingman coalescent, it suffices to consider $\pi_k$ chosen such that $\pi_k$ results from $\pi_{k-1}$ through 
a coalescence of exactly two blocks for all $k \in [m-1]$. We define 

$$C_k := \{ (\min B_1, \min B_2) \mid B_1, B_2 \text{ are blocks of } \pi_k,\ \min B_1 < \min B_2 \}.$$
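
As a small illustration of this definition, the set of ordered pairs of block minima can be enumerated directly, and its size is the number of unordered pairs of blocks. This is a sketch only; the function name `block_minima_pairs` and the example partition are hypothetical.

```python
from itertools import combinations
from math import comb


def block_minima_pairs(partition):
    """Return the set of pairs (min B1, min B2) with min B1 < min B2,
    taken over all pairs of distinct blocks B1, B2 of `partition`."""
    minima = sorted(min(block) for block in partition)
    # combinations() of a sorted list yields each pair in increasing order.
    return set(combinations(minima, 2))


if __name__ == "__main__":
    partition = [{1, 4}, {2}, {3, 5}]
    pairs = block_minima_pairs(partition)
    print(sorted(pairs))
    print(len(pairs) == comb(len(partition), 2))
```

Each block is identified by its smallest element, so a pair in the set names the two blocks that may merge next.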

Thus, we have $|C_k| = \binom{\#\pi_k}{2} = \binom{m-k}{2}$. Let $(i,j) \in C_{k-1} \setminus C_k$, meaning that on the event $A^L_k$ at time 
$\hat\tau^L_k$ the blocks containing $i$ and $j$ in the partition $\pi_{k-1}$ merge to form the partition $\pi_k$. Thus, on the 
event $A^L_{k-1}$ we have $\{\hat\Pi^L_{\hat\tau^L_k} = \pi_k\} = \{\hat\tau^L_k = \hat\tau^L(i,j)\}$. We set $q_L := (\log\log L)(2L+1)^2$ for all $L \in \mathbb{N}$ 
and assume that we consider an $L$ large enough so that $q_L < s_L t_k$ for $k \in [m-1]$. Let $\mathcal{F}^L_t$ be the 
$\sigma$-algebra generated by $\hat\Pi^{\ell,L}$ up to time $t$. Then $A^L_{k-1}$ is $\mathcal{F}^L_{\hat\tau^L_{k-1}}$ measurable and we have that 

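$$\mathbb{P}^{\pi^{\ell,L}}(A^L_k) = \mathbb{E}^{\pi^{\ell,L}}\Big( \mathbb{P}^{\pi^{\ell,L}}\big( \hat\sigma^L_k > s_L t_k,\ \hat\Pi^L_{\hat\tau^L_k} = \pi_k \,\big|\, \mathcal{F}^L_{\hat\tau^L_{k-1}} \big)\, 1_{A^L_{k-1}} \Big)$$

$$= \mathbb{E}^{\pi^{\ell,L}}\Big( \mathbb{P}^{\pi^{\ell,L}}\big( \hat\sigma^L_k > s_L t_k,\ \hat\tau^L_k = \hat\tau^L(i,j) \,\big|\, \mathcal{F}^L_{\hat\tau^L_{k-1}} \big)\, 1_{A^L_{k-1}} \Big)$$

$$= \mathbb{E}^{\pi^{\ell,L}}\Big( \mathbb{P}^{\hat\Pi^{\ell,L}_{\hat\tau^L_{k-1}}}\big( \hat\tau^L_1 = \hat\tau^L(i,j) > s_L t_k \big)\, 1_{A^L_{k-1}} \Big). \qquad (61)$$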



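By using that $\{\hat\tau^L_1 > s_L t_k\} = \{\hat\tau^L_1 > q_L,\ \hat\tau^L_1 \circ \theta_{q_L} > s_L t_k - q_L\}$ we obtain from conditioning on the 
information up to time $q_L$ that 

$$\mathbb{P}^{\pi^{\ell,L}}\big( \hat\tau^L_1 = \hat\tau^L(i,j) > s_L t_k \big) = \mathbb{E}^{\pi^{\ell,L}}\Big( f(\hat\Pi^{\ell,L}_{q_L})\, 1_{\{\hat\tau^L_1 > q_L\}} \Big), \qquad (62)$$

where the function $f$ is defined by 

$$f : \mathcal{P}^{\ell,L}_n \to \mathbb{R}, \quad \pi \mapsto \mathbb{P}^{\pi}\big( \hat\tau^L_1 = \hat\tau^L(i,j) > s_L t_k - q_L \big).$$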

Note that $f$ is measurable and bounded. For $(l,m) \in C_{k-1}$ let $\sigma_{l,m}$ be the permutation of $[\#\pi_{k-1}]$ 
that only swaps $i$ with $l$ and also $j$ with $m$. Let $\pi^{\sigma_{l,m},\ell,L}$ be the partition obtained from $\pi^{\ell,L}$ by 
permuting the labels with $\sigma_{l,m}$ (as in Lemma 6.10). We also set $\tilde A^L_{k-1} := A^L_{k-1} \cap D^L_{k-1}$. Then, due to 
(61) and (62), 



(Af)-E* 



Z 



\(l,m)eC k 

n 



\C k - 



-1 l.rrr • 



e ^(/(ft^)i {ff>?L) )-E 4-. (/(n^)i in>9i) ) 



+ 2?"' ((£>f ,) c ) 



Since due to (59) the probability for the event $\{\hat\tau^L_1 > q_L\}$ is arbitrarily close to 1 for large $L$, 
we have that the first term on the right hand side converges to 0 as $L \to \infty$ because of Lemma 6.10 
and Lebesgue's dominated convergence theorem. The second term converges to 0 due to 
Proposition 6.8. Thus, we obtain 



yj « 



+ 0(1). 
Note that for the process started in $\hat\Pi^{\sigma_{l,m},\ell,L}_{\hat\tau^L_{k-1}}$ the event $\{\hat\tau^L_1 = \hat\tau^L(i,j) > s_L t_k\}$ is the same 
as the event $\{\hat\tau^L_1 = \hat\tau^L(l,m) > s_L t_k\}$ for the process started in $\hat\Pi^{\ell,L}_{\hat\tau^L_{k-1}}$. Since the events 
$\{\hat\tau^L(l,m) = \hat\tau^L_1\}$, $(l,m) \in C_{k-1}$, are a partition of the probability space it then follows that 



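$$\mathbb{P}^{\pi^{\ell,L}}(A^L_k) = |C_{k-1}|^{-1}\, \mathbb{E}^{\pi^{\ell,L}}\Big( \mathbb{P}^{\hat\Pi^{\ell,L}_{\hat\tau^L_{k-1}}}\big( \hat\tau^L_1 > s_L t_k \big)\, 1_{A^L_{k-1}} \Big) + o(1)$$

$$= |C_{k-1}|^{-1}\, \mathbb{E}^{\pi^{\ell,L}}\left( \left[ \mathbb{P}^{\hat\Pi^{\ell,L}_{\hat\tau^L_{k-1}}}\big( \hat\tau^L_1 > s_L t_k \big) - \exp\left( -\pi \binom{m-(k-1)}{2} t_k \right) \right] 1_{A^L_{k-1}} \right)$$

$$\quad + |C_{k-1}|^{-1} \exp\left( -\pi \binom{m-(k-1)}{2} t_k \right) \mathbb{P}^{\pi^{\ell,L}}(A^L_{k-1}) + o(1).$$

It now follows from (59) that 

$$\lim_{L \to \infty} \mathbb{P}^{\pi^{\ell,L}}(A^L_k) = \binom{m-(k-1)}{2}^{-1} \exp\left( -\pi \binom{m-(k-1)}{2} t_k \right) \lim_{L \to \infty} \mathbb{P}^{\pi^{\ell,L}}(A^L_{k-1}).$$

By induction we then have that 

$$\lim_{L \to \infty} \mathbb{P}^{\pi^{\ell,L}}(A^L_{m-1}) = \prod_{i=1}^{m-1} \binom{m-(i-1)}{2}^{-1} \exp\left( -\pi \binom{m-(i-1)}{2} t_i \right) = \prod_{i=1}^{m-1} \mathbb{P}\big( K^{\pi_0}_{T_{K,i}} = \pi_i \,\big|\, K^{\pi_0}_{T_{K,i-1}} = \pi_{i-1} \big) \cdot \mathbb{P}\big( \pi^{-1} U_i > t_i \big).$$

Since this is the desired quantity this finishes the proof of convergence in distribution. □ 

We now formulate the corresponding result for $\Lambda$-$n$-coalescents. 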



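Corollary 6.12. Let $\pi_0 \in \mathcal{P}_n$ with $m := \#\pi_0$. We consider for $L \in \mathbb{N}$ large enough a labeled 
partition $\pi^{\ell,L} \in [[a_L, \sqrt{2}L]]$ with corresponding partition structure $\pi^L = \pi_0$. Then there is 
convergence in distribution for $L \to \infty$, 

$$\left( \frac{\sigma^L_1}{s_L}, \Pi^L_{\tau^L_1}, \ldots, \frac{\sigma^L_{m-1}}{s_L}, \Pi^L_{\tau^L_{m-1}} \right) \Rightarrow \left( \pi^{-1} U_1, K^{\pi_0}_{T_{K,1}}, \ldots, \pi^{-1} U_{m-1}, K^{\pi_0}_{T_{K,m-1}} \right).$$

Proof. For $L \in \mathbb{N}$ we set 

$$V_L := \left( \frac{\sigma^L_1}{s_L}, \Pi^L_{\tau^L_1}, \ldots, \frac{\sigma^L_{m-1}}{s_L}, \Pi^L_{\tau^L_{m-1}} \right),$$

and we write $\hat V_L$ for the corresponding vector of the coalescing random walk and $V_K$ for the limit 
in Theorem 6.11. Due to Propositions 6.3 and 6.8 we have that $\hat V_L - V_L$ converges to 0 in probability. 
From Theorem 6.11 we obtain that $\hat V_L \Rightarrow V_K$. Taken together this implies that $V_L \Rightarrow V_K$ as required. □ 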



We are finally ready to prove our main result, Theorem 4.1. 

Proof. For any fixed $n$ we prove convergence in the Skorohod space $D(\mathbb{R}_+, \mathcal{P}_n)$ 
by first showing relative compactness and then weak convergence of the finite dimensional 
distributions, see Theorem 3.7.8 of Ethier and Kurtz (1986). We start with the relative compactness 
of $((\Pi^L_{s_L t})_{t \in \mathbb{R}_+})_{L \in \mathbb{N}}$. Since $\mathcal{P}_n$ is compact it suffices to show by Theorem 3.6.3 of Ethier and Kurtz 
(1986) that for all $T > 0$ and $\varepsilon > 0$ there exists a $\delta > 0$ such that 

$$\limsup_{L \to \infty} \mathbb{P}^{\pi^{\ell,L}}\big( w(\Pi^L_{s_L \cdot}, \delta, T) \ge \varepsilon \big) \le \varepsilon, \qquad (63)$$

where $w(\Pi^L_{s_L \cdot}, \delta, T)$ is the $\delta$-modulus of continuity of $\Pi^L_{s_L \cdot}$ on $[0, T]$ (see also (37)). Namely, it is 
the infimum over all partitions of the form $0 = t_0 < t_1 < \cdots < t_{k-1} < T \le t_k$ such that $t_i - t_{i-1} > \delta$ 
for all $1 \le i \le k$ of the quantity 

$$\max_{1 \le i \le k} \; \sup_{r, t \in [t_{i-1}, t_i)} d_n\big( \Pi^L_{s_L r}, \Pi^L_{s_L t} \big).$$



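We set $T > 0$ and $\delta > 0$ as well as 

$$c_1 := \max\left\{ \pi \binom{\#\pi_0 - (k-1)}{2} : k \in [\#\pi_0 - 1] \right\}.$$

Due to Corollary 6.12 there is an $L(\delta)$ such that for all $L \ge L(\delta)$ and all $k \in [\#\pi_0 - 1]$ we have 

$$\mathbb{P}^{\pi^{\ell,L}}\big( s_L^{-1} \sigma^L_k < 2\delta \text{ for one } k \in [\#\pi_0 - 1] \big) \le \delta + \mathbb{P}^{\pi_0}\big( \pi^{-1} U_k < 2\delta \text{ for one } k \in [\#\pi_0 - 1] \big)$$

$$\le \delta + (\#\pi_0 - 1)\big( 1 - \exp(-2 c_1 \delta) \big) \le \big( 1 + 2 c_1 (\#\pi_0 - 1) \big) \delta.$$

Since $(\Pi^L_{s_L t})_{t \in \mathbb{R}_+}$ is constant on the intervals $[s_L^{-1} \tau^L_{k-1}, s_L^{-1} \tau^L_k)$ we have on the event that $s_L^{-1} \sigma^L_k > 2\delta$ 
for all $k \in [\#\pi_0 - 1]$ that $w(\Pi^L_{s_L \cdot}, \delta, T) = 0$. Hence, (63) and so the relative compactness follows 
by choosing $\delta = \varepsilon (1 + 2 c_1 (\#\pi_0 - 1))^{-1}$. 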

Lastly, we show the weak convergence of the finite dimensional distributions. Let $l \in \mathbb{N}$ and 
$t_1, \ldots, t_l \in \mathbb{R}_+$. By definition 



(64) 



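We define for $k \in [l]$ the function $f_k : (\mathbb{R}_+ \times \mathcal{P}_n)^{\#\pi_0 - 1} \to \mathcal{P}_n$ by 

$$(r_1, \pi_1, \ldots, r_{\#\pi_0 - 1}, \pi_{\#\pi_0 - 1}) \mapsto \begin{cases} \pi_0, & t_k \in [0, r_1), \\ \pi_i, & t_k \in \big[ \sum_{j=1}^{i} r_j, \sum_{j=1}^{i+1} r_j \big),\ i < \#\pi_0 - 1, \\ \pi_{\#\pi_0 - 1}, & t_k \ge \sum_{j=1}^{\#\pi_0 - 1} r_j. \end{cases}$$

We define the function 

$$f := (f_1, \ldots, f_l) : (\mathbb{R}_+ \times \mathcal{P}_n)^{\#\pi_0 - 1} \to (\mathcal{P}_n)^l.$$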
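
The map from waiting times and successive partitions to the values of the path at fixed times can be written out as follows. This is a minimal sketch with hypothetical names (`partition_at`, `evaluate_path`), assuming, as is usual for such step paths, that the path is right-continuous and takes the $i$-th partition from the $i$-th cumulative waiting time onward.

```python
from bisect import bisect_right
from itertools import accumulate


def partition_at(t, initial, waits, partitions):
    """Value at time t of the step path that starts in `initial` and jumps to
    partitions[i] at time waits[0] + ... + waits[i] (right-continuous)."""
    jump_times = list(accumulate(waits))
    n_jumps = bisect_right(jump_times, t)  # number of jump times <= t
    return initial if n_jumps == 0 else partitions[n_jumps - 1]


def evaluate_path(times, initial, waits, partitions):
    """Evaluate the step path at each of the given times."""
    return tuple(partition_at(t, initial, waits, partitions) for t in times)


if __name__ == "__main__":
    # Three blocks merge in two steps; partitions are represented by labels here.
    print(evaluate_path([0.2, 0.5, 1.4, 2.0], "pi_0", [0.5, 1.0], ["pi_1", "pi_2"]))
```

The map is discontinuous only where one of the fixed times coincides with a cumulative waiting time, which is why continuity of the limiting waiting times matters when passing to the limit.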



Let $V_L$ and $V_K$ be defined as in the proof of Corollary 6.12. Due to (64) we have 

$$f(V_L) = \big( \Pi^L_{s_L t_1}, \ldots, \Pi^L_{s_L t_l} \big),$$

and likewise 

$$f(V_K) = \big( K^{\pi_0}_{\pi t_1}, \ldots, K^{\pi_0}_{\pi t_l} \big).$$

Since the $U_k$ are continuous random variables we note that the event that $V_K$ takes values in the 
discontinuity set of $f$ has probability 0. Thus, due to $V_L \Rightarrow V_K$ from Corollary 6.12 we obtain 

$$\big( \Pi^L_{s_L t_1}, \ldots, \Pi^L_{s_L t_l} \big) = f(V_L) \Rightarrow f(V_K) = \big( K^{\pi_0}_{\pi t_1}, \ldots, K^{\pi_0}_{\pi t_l} \big).$$

This finishes the proof. □ 